
MediaPipe Human Pose Estimation with PiscCode: A Three-Panel Visual Comparison

news 2025/8/22 14:30:22

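I. Introduction

Human pose estimation is an important area of computer vision. By detecting the positions of body keypoints (nose, shoulders, elbows, knees, ankles, and so on), it helps us understand human motion and posture.
Common applications include:

Correcting fitness exercise form
Assisting sports training
Motion capture (games, animation)
Security and human-computer interaction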

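MediaPipe, open-sourced by Google, provides a powerful pose estimation model that detects 33 body keypoints in real time.

In this article we build a "three-panel comparison" on top of MediaPipe Pose Landmarker:

Left: the original image
Middle: the skeleton (keypoints and connecting lines only)
Right: the skeleton overlaid on the original image

The result looks like this (illustration):

Seeing the three views side by side makes the detection results easier to compare, which helps with debugging and demos.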

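II. Environment Setup

The code in this article imports cv2, numpy, and mediapipe, so the OpenCV, NumPy, and MediaPipe Python packages need to be installed. You also need to download a Pose model file provided by MediaPipe (for example pose_landmarker_heavy.task) and place it in a local directory.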
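A wrong model path only surfaces when the detector is created, so it can help to check the file up front. The following is a minimal sketch using only the standard library; the helper name resolve_model_path is hypothetical and not part of MediaPipe or PiscCode.

```python
from pathlib import Path


def resolve_model_path(model_path):
    """Return the absolute path of the model file, or raise if it is missing.

    Hypothetical helper: it only checks the file system and does not load
    or validate the model itself.
    """
    path = Path(model_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Pose model file not found: {path}")
    return str(path.resolve())
```

The returned string can then be passed as model_path when constructing the class described below.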

III. Core Code Walkthrough

The main class is PoseObjectDIYTriple.

1. Initializing the model

base_options = python.BaseOptions(model_asset_path=model_path)
options = vision.PoseLandmarkerOptions(
    base_options=base_options,
    num_poses=num_poses,
    running_mode=vision.RunningMode.IMAGE,
)
self.detector = vision.PoseLandmarker.create_from_options(options)

vision.PoseLandmarkerOptions configures the model:

model_path: path to the model file
num_poses: maximum number of people to detect (default 1)
running_mode=IMAGE: run in single-image mode

The result, self.detector, is the pose detector. Calling self.detector.detect(...) on an image returns the detected landmarks.

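2. Defining keypoint colors

To make the keypoints easier to tell apart, each one gets its own color. OpenCV reads these tuples as BGR, so the channel order matters:

self.landmark_colors = {
    0: (0, 0, 255),     # nose: red
    7: (255, 0, 0),     # left ear: blue
    11: (255, 165, 0),  # left shoulder: light blue in BGR (this tuple is orange only in RGB order)
    13: (255, 0, 255),  # left elbow: magenta
    27: (0, 128, 128),  # left ankle: olive in BGR (this tuple is teal only in RGB order)
    ...
}

This makes the skeleton easy to read on screen.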
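Because OpenCV expects BGR while most color references list RGB, it is easy to end up with a different color than intended. One way to avoid this, sketched below, is to write the palette in RGB and reverse it once. The helper rgb_to_bgr and the palette names are hypothetical; only the landmark indices come from the article.

```python
def rgb_to_bgr(color):
    """Reverse an (R, G, B) tuple into the (B, G, R) order OpenCV expects."""
    r, g, b = color
    return (b, g, r)


# Hypothetical palette written in familiar RGB terms.
RGB_PALETTE = {
    0: (255, 0, 0),     # nose: red
    11: (255, 165, 0),  # left shoulder: orange
    27: (0, 128, 128),  # left ankle: teal
}

# Convert once; the result has the same shape as landmark_colors.
BGR_PALETTE = {idx: rgb_to_bgr(c) for idx, c in RGB_PALETTE.items()}
print(BGR_PALETTE[11])  # (0, 165, 255)
```

A dictionary built this way could be passed as the landmark_colors argument of the class constructor.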

3. Drawing the skeleton

The core drawing function is _draw_skeleton:

def _draw_skeleton(self, frame, pose_landmarks, draw_points=True):
    skeleton_img = np.zeros_like(frame)
    h, w, _ = frame.shape

    # Draw the skeleton connections
    for start_idx, end_idx in self.connections:
        x1, y1 = int(pose_landmarks[start_idx].x * w), int(pose_landmarks[start_idx].y * h)
        x2, y2 = int(pose_landmarks[end_idx].x * w), int(pose_landmarks[end_idx].y * h)
        cv2.line(skeleton_img, (x1, y1), (x2, y2), self.line_color, self.line_thickness)

    # Draw the keypoints
    if draw_points:
        for idx, lm in enumerate(pose_landmarks):
            cx, cy = int(lm.x * w), int(lm.y * h)
            color = self.landmark_colors.get(idx, (255, 255, 255))
            cv2.circle(skeleton_img, (cx, cy), self.point_size, color, -1)

    return skeleton_img

Key points:

Each landmark's x and y coordinates are normalized (0 to 1), so they must be multiplied by the image width and height to get pixel coordinates.
cv2.line draws the connections and cv2.circle draws the keypoints.
The frame argument only supplies the shape and dtype: the function always draws onto a new black canvas and returns it.
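The coordinate conversion described above can be sketched in isolation. This is a minimal illustration, not the article's code: the function name landmark_to_pixel is hypothetical, and the clamping step is an addition on the assumption that a landmark may occasionally be estimated slightly outside the 0 to 1 range.

```python
def landmark_to_pixel(x_norm, y_norm, width, height):
    """Map normalized landmark coordinates to integer pixel coordinates.

    The result is clamped to the image bounds (an added safeguard,
    not something the original drawing code does).
    """
    px = int(x_norm * width)
    py = int(y_norm * height)
    px = min(max(px, 0), width - 1)
    py = min(max(py, 0), height - 1)
    return px, py


print(landmark_to_pixel(0.5, 0.25, 640, 480))  # (320, 120)
print(landmark_to_pixel(1.2, -0.1, 640, 480))  # (639, 0)
```

OpenCV clips drawing calls to the canvas on its own, so clamping mainly matters if the pixel coordinates are later used to index into an array.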

4. Stitching the three panels

Finally, the do method stitches the three images together:

def do(self, frame):
    mp_image = mp.Image(
        image_format=mp.ImageFormat.SRGB,
        data=cv2.cvtColor(frame, cv2.COLOR_BGR2RGB),
    )
    detection_result = self.detector.detect(mp_image)

    # Initialize the middle and right panels
    skeleton_only = np.zeros_like(frame)
    skeleton_overlay = np.zeros_like(frame)

    if detection_result.pose_landmarks:
        for pose_landmarks in detection_result.pose_landmarks:
            # Skeleton only
            skeleton_only = self._draw_skeleton(skeleton_only, pose_landmarks, draw_points=True)
            # Skeleton overlaid on the original frame
            skeleton_overlay = self._draw_skeleton(np.zeros_like(frame), pose_landmarks, draw_points=True)
            skeleton_overlay = cv2.addWeighted(frame, 1.0, skeleton_overlay, 1.0, 0)

    # Concatenate the three images horizontally
    triple_frame = np.concatenate([frame, skeleton_only, skeleton_overlay], axis=1)
    return triple_frame

This produces a single frame with the three panels side by side.
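One detail in the loop above: _draw_skeleton starts from a fresh black canvas on every call, so when num_poses is greater than 1 each iteration replaces the previous person's drawing and only the last pose remains. The sketch below shows one possible way to accumulate per-person layers before stitching. It uses only NumPy on synthetic data; merge_layers, overlay, and stitch are hypothetical names, and keeping the brightest pixel is a substitution of mine rather than the article's method.

```python
import numpy as np


def merge_layers(layers, shape):
    """Combine per-person skeleton layers by keeping the brightest pixel."""
    merged = np.zeros(shape, dtype=np.uint8)
    for layer in layers:
        merged = np.maximum(merged, layer)
    return merged


def overlay(frame, skeleton):
    """Saturating add, similar in effect to
    cv2.addWeighted(frame, 1.0, skeleton, 1.0, 0)."""
    total = frame.astype(np.uint16) + skeleton.astype(np.uint16)
    return np.clip(total, 0, 255).astype(np.uint8)


def stitch(frame, skeleton, horizontal=True):
    """Build the three-panel frame: original, skeleton only, overlay."""
    axis = 1 if horizontal else 0
    return np.concatenate([frame, skeleton, overlay(frame, skeleton)], axis=axis)


# Tiny synthetic example: a gray 4x6 frame and two one-pixel "skeletons".
frame = np.full((4, 6, 3), 100, dtype=np.uint8)
person_a = np.zeros_like(frame)
person_a[1, 1] = (255, 255, 255)
person_b = np.zeros_like(frame)
person_b[2, 4] = (0, 0, 255)

skeleton = merge_layers([person_a, person_b], frame.shape)
print(stitch(frame, skeleton).shape)         # (4, 18, 3)
print(stitch(frame, skeleton, False).shape)  # (12, 6, 3)
```

Because the overlay is a saturating add, colored points blend with bright backgrounds, and a pure black color adds nothing to the frame.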

IV. Running Example

import cv2
import numpy as np
import mediapipe as mp
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
from mediapipe.tasks import python
from mediapipe.tasks.python import vision


class PoseObjectDIYTriple:
    def __init__(self,
                 model_path="path/to/pose_landmarker_heavy.task",
                 num_poses=1,
                 point_size=10,
                 line_thickness=6,
                 landmark_colors=None,
                 line_color=(255, 255, 255)):
        """Initialize the customized MediaPipe PoseLandmarker (three-panel comparison)."""
        base_options = python.BaseOptions(model_asset_path=model_path)
        options = vision.PoseLandmarkerOptions(
            base_options=base_options,
            num_poses=num_poses,
            running_mode=vision.RunningMode.IMAGE,
        )
        self.detector = vision.PoseLandmarker.create_from_options(options)

        self.point_size = point_size
        self.line_thickness = line_thickness
        self.line_color = line_color
        self.connections = solutions.pose.POSE_CONNECTIONS

        # Default color scheme in BGR (matching left/right parts share a color)
        if landmark_colors is None:
            self.landmark_colors = {
                0: (0, 0, 255),                             # nose
                1: (0, 255, 0), 2: (0, 255, 0), 3: (0, 255, 0),  # left eye (inner, center, outer)
                4: (0, 255, 0), 5: (0, 255, 0), 6: (0, 255, 0),  # right eye (inner, center, outer)
                7: (255, 0, 0), 8: (255, 0, 0),             # ears
                9: (0, 255, 255), 10: (0, 255, 255),        # mouth corners
                11: (255, 165, 0), 12: (255, 165, 0),       # shoulders
                13: (255, 0, 255), 14: (255, 0, 255),       # elbows
                15: (0, 128, 255), 16: (0, 128, 255),       # wrists
                17: (128, 0, 128), 18: (128, 0, 128),       # pinkies
                19: (0, 128, 0), 20: (0, 128, 0),           # index fingers
                21: (128, 128, 0), 22: (128, 128, 0),       # thumbs
                23: (0, 0, 128), 24: (0, 0, 128),           # hips
                25: (128, 0, 0), 26: (128, 0, 0),           # knees
                27: (0, 128, 128), 28: (0, 128, 128),       # ankles
                29: (128, 128, 128), 30: (128, 128, 128),   # heels
                31: (0, 0, 0), 32: (0, 0, 0),               # foot index; black is not visible on the black canvas
            }
        else:
            self.landmark_colors = landmark_colors

    def _draw_skeleton(self, frame, pose_landmarks, draw_points=True):
        """Draw the skeleton on a new black canvas, with custom points and lines."""
        skeleton_img = np.zeros_like(frame)
        h, w, _ = frame.shape

        # Draw the skeleton connections
        for start_idx, end_idx in self.connections:
            x1, y1 = int(pose_landmarks[start_idx].x * w), int(pose_landmarks[start_idx].y * h)
            x2, y2 = int(pose_landmarks[end_idx].x * w), int(pose_landmarks[end_idx].y * h)
            cv2.line(skeleton_img, (x1, y1), (x2, y2), self.line_color, self.line_thickness)

        # Draw the keypoints
        if draw_points:
            for idx, lm in enumerate(pose_landmarks):
                cx, cy = int(lm.x * w), int(lm.y * h)
                color = self.landmark_colors.get(idx, (255, 255, 255))
                cv2.circle(skeleton_img, (cx, cy), self.point_size, color, -1)

        return skeleton_img

    def do(self, frame, device):
        """Build the three-panel comparison frame. The device argument is not used here."""
        if frame is None:
            return None

        mp_image = mp.Image(
            image_format=mp.ImageFormat.SRGB,
            data=cv2.cvtColor(frame, cv2.COLOR_BGR2RGB),
        )
        detection_result = self.detector.detect(mp_image)

        # Initialize the middle and right panels
        skeleton_only = np.zeros_like(frame)
        skeleton_overlay = np.zeros_like(frame)

        if detection_result.pose_landmarks:
            for pose_landmarks in detection_result.pose_landmarks:
                skeleton_only = self._draw_skeleton(skeleton_only, pose_landmarks, draw_points=True)
                skeleton_overlay = self._draw_skeleton(np.zeros_like(frame), pose_landmarks, draw_points=True)
                skeleton_overlay = cv2.addWeighted(frame, 1.0, skeleton_overlay, 1.0, 0)

        # Stitch the three panels horizontally
        triple_frame = np.concatenate([frame, skeleton_only, skeleton_overlay], axis=1)
        return triple_frame

After running it you will see:

Left: the original video
Middle: the skeleton view
Right: the skeleton overlaid on the original video
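The listing defines the class but not the loop that feeds it frames; in the original setup that part is presumably handled by PiscTrace. The following is a minimal sketch of a standalone driver, assuming a webcam or video file read through OpenCV. The names process_stream, camera_frames, and main are hypothetical, and the processor is any object with a do(frame, device) method.

```python
def process_stream(frames, processor, device=None):
    """Yield one processed frame per input frame, skipping empty results."""
    for frame in frames:
        result = processor.do(frame, device)
        if result is not None:
            yield result


def camera_frames(source=0):
    """Yield BGR frames from an OpenCV capture until the stream ends."""
    import cv2  # imported lazily so process_stream has no hard dependency

    capture = cv2.VideoCapture(source)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            yield frame
    finally:
        capture.release()


def main(processor, source=0):
    """Show the three-panel view in a window; press q to quit."""
    import cv2

    for triple in process_stream(camera_frames(source), processor):
        cv2.imshow("pose triple view", triple)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cv2.destroyAllWindows()
```

With the class from the listing, this could be started as main(PoseObjectDIYTriple(model_path=...)), where the model path points at the downloaded .task file.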

For a vertical layout, change the concatenation to:

        triple_frame = np.concatenate([frame, skeleton_only, skeleton_overlay], axis=0)

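V. Summary and Extensions

In this article we built a three-panel comparison tool based on MediaPipe Pose Landmarker. It can:
✅ Visualize pose estimation results
✅ Compare the original image with the skeleton rendering
✅ Make it easier to debug detection quality

From here you can extend it further:

Vertical stacking: arrange the three panels top to bottom
Multi-person detection: set num_poses > 1
Action recognition: use the keypoint coordinates to recognize actions such as raising a hand, squatting, or running
Real-time fitness coaching: compare against a reference movement and give feedback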
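As a starting point for the action recognition idea, a common building block is the angle at a joint computed from three keypoints. The sketch below is a hedged illustration: joint_angle and is_squatting are hypothetical helpers, the 100 degree threshold is an arbitrary assumption, and the points are expected in pixel coordinates (normalized coordinates would distort angles unless the image is square).

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return 0.0  # degenerate case: two points coincide
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


# MediaPipe Pose indices: 23 = left hip, 25 = left knee, 27 = left ankle.
LEFT_HIP, LEFT_KNEE, LEFT_ANKLE = 23, 25, 27


def is_squatting(points, threshold_deg=100.0):
    """Hypothetical rule: a left knee angle below the threshold counts as a squat.

    points maps a landmark index to an (x, y) pixel coordinate.
    """
    angle = joint_angle(points[LEFT_HIP], points[LEFT_KNEE], points[LEFT_ANKLE])
    return angle < threshold_deg


standing = {23: (100, 100), 25: (100, 200), 27: (100, 300)}
squatting = {23: (100, 200), 25: (180, 200), 27: (180, 300)}
print(is_squatting(standing), is_squatting(squatting))  # False True
```

A single-frame threshold like this is fragile; a practical version would likely smooth the angle over several frames and check both legs.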

Interested in PiscTrace or PiscCode? Visit the official site for more. 🔗 PiscTrace
