PiscCode Integration with Hand Landmarker: High-Precision Hand Pose Detection and Analysis
Technical Overview
Hand pose detection is a cutting-edge computer vision technique that tracks and recognizes fine-grained hand motion in real time. MediaPipe Hand Landmarker, an efficient hand keypoint detection model developed by Google, provides a solid technical foundation for gesture recognition and human-computer interaction applications.
Core Technical Principles
Model Architecture
MediaPipe Hand Landmarker uses a two-stage detection architecture:
- Palm detector: a BlazePalm model first quickly localizes palm regions in the image
- Hand landmark detection: 21 3D hand keypoints are then precisely localized within each detected palm region
This division of labor keeps detection fast while preserving the accuracy of landmark localization.
Keypoint Definition
The model detects 21 hand keypoints:
- 0: wrist
- 1-4: thumb joints
- 5-8: index finger joints
- 9-12: middle finger joints
- 13-16: ring finger joints
- 17-20: pinky joints
Each keypoint carries (x, y, z) coordinates, where the z value encodes depth.
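For quick lookups, this index layout can be captured in a small constant. Below is a minimal sketch; LANDMARK_NAMES and fingertip_pixels are illustrative names introduced here, not part of MediaPipe's API:

```python
# Illustrative constant: names for a few commonly used landmark indices
LANDMARK_NAMES = {
    0: "wrist",
    4: "thumb_tip",
    8: "index_finger_tip",
    12: "middle_finger_tip",
    16: "ring_finger_tip",
    20: "pinky_tip",
}

def fingertip_pixels(hand_landmarks, image_width, image_height):
    """Convert the normalized fingertip landmarks of one hand to pixel coordinates."""
    return {
        name: (int(hand_landmarks[idx].x * image_width),
               int(hand_landmarks[idx].y * image_height))
        for idx, name in LANDMARK_NAMES.items() if idx != 0
    }
```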
Code Walkthrough
1. Initializing the HandLandmarker
```python
def __init__(self, model_path="path/to/hand_landmarker.task", num_hands=2):
    base_options = python.BaseOptions(model_asset_path=model_path)
    options = vision.HandLandmarkerOptions(
        base_options=base_options,
        num_hands=num_hands  # maximum number of hands to detect
    )
    self.detector = vision.HandLandmarker.create_from_options(options)
```
Parameters:
- model_path: path to the pretrained model file
- num_hands: maximum number of hands to detect (1 or 2); an extended options sketch follows this list
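Beyond num_hands, HandLandmarkerOptions in the Tasks API also exposes confidence thresholds. The sketch below assumes the current MediaPipe Tasks Python keyword names (min_hand_detection_confidence, min_hand_presence_confidence, min_tracking_confidence); verify them against your installed version:

```python
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

# Sketch: detector creation with optional confidence thresholds
# (keyword names assume the current MediaPipe Tasks Python API)
base_options = python.BaseOptions(model_asset_path="path/to/hand_landmarker.task")
options = vision.HandLandmarkerOptions(
    base_options=base_options,
    num_hands=2,
    min_hand_detection_confidence=0.5,  # palm detection threshold
    min_hand_presence_confidence=0.5,   # hand presence threshold
    min_tracking_confidence=0.5,        # inter-frame tracking threshold
)
detector = vision.HandLandmarker.create_from_options(options)
```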
2. Drawing Hand Keypoints and Connections
```python
def _draw_landmarks_on_image(self, rgb_image, detection_result):
    annotated_image = np.copy(rgb_image)
    if detection_result.hand_landmarks:
        for hand_landmarks in detection_result.hand_landmarks:
            # Convert the landmarks to the protobuf format expected by drawing_utils
            hand_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
            hand_landmarks_proto.landmark.extend([
                landmark_pb2.NormalizedLandmark(x=lm.x, y=lm.y, z=lm.z)
                for lm in hand_landmarks
            ])
            # Draw keypoints and connecting lines
            solutions.drawing_utils.draw_landmarks(
                image=annotated_image,
                landmark_list=hand_landmarks_proto,
                connections=mp.solutions.hands.HAND_CONNECTIONS,  # predefined hand connection topology
                landmark_drawing_spec=solutions.drawing_styles.get_default_hand_landmarks_style(),
                connection_drawing_spec=solutions.drawing_styles.get_default_hand_connections_style()
            )
    return annotated_image
```
3. Real-Time Processing Pipeline
```python
import cv2
import numpy as np
import mediapipe as mp
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
from mediapipe.tasks import python
from mediapipe.tasks.python import vision


class HandPose:
    def __init__(self, model_path="path/to/hand_landmarker.task", num_hands=2):
        """Initialize the MediaPipe HandLandmarker."""
        base_options = python.BaseOptions(model_asset_path=model_path)
        options = vision.HandLandmarkerOptions(
            base_options=base_options,
            num_hands=num_hands
        )
        self.detector = vision.HandLandmarker.create_from_options(options)

    def _draw_landmarks_on_image(self, rgb_image, detection_result):
        """Draw hand keypoints and connections on the image."""
        annotated_image = np.copy(rgb_image)
        if detection_result.hand_landmarks:
            for hand_landmarks in detection_result.hand_landmarks:
                hand_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
                hand_landmarks_proto.landmark.extend([
                    landmark_pb2.NormalizedLandmark(x=lm.x, y=lm.y, z=lm.z)
                    for lm in hand_landmarks
                ])
                solutions.drawing_utils.draw_landmarks(
                    image=annotated_image,
                    landmark_list=hand_landmarks_proto,
                    connections=mp.solutions.hands.HAND_CONNECTIONS,
                    landmark_drawing_spec=solutions.drawing_styles.get_default_hand_landmarks_style(),
                    connection_drawing_spec=solutions.drawing_styles.get_default_hand_connections_style()
                )
        return annotated_image

    def do(self, frame, device=None):
        """Process a single frame and return it with hand landmarks drawn."""
        if frame is None:
            return None
        mp_image = mp.Image(
            image_format=mp.ImageFormat.SRGB,
            data=cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        )
        detection_result = self.detector.detect(mp_image)
        annotated = self._draw_landmarks_on_image(mp_image.numpy_view(), detection_result)
        return cv2.cvtColor(annotated, cv2.COLOR_RGB2BGR)
```
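To try the class outside a PiscCode pipeline, a plain OpenCV capture loop is enough. A minimal sketch, assuming the HandPose class above is in the same file; the camera index, window title, and model path are placeholder choices:

```python
if __name__ == "__main__":
    hand_pose = HandPose(model_path="path/to/hand_landmarker.task", num_hands=2)
    cap = cv2.VideoCapture(0)  # default webcam
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        annotated = hand_pose.do(frame)
        cv2.imshow("Hand Landmarker", annotated if annotated is not None else frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to quit
            break
    cap.release()
    cv2.destroyAllWindows()
```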
The default rendering, of course, is not as polished as a custom one. The variant below keeps the same pipeline but draws every keypoint in a uniform style and gives each finger's connections its own color.
```python
import cv2
import numpy as np
import mediapipe as mp
from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
from mediapipe.tasks import python
from mediapipe.tasks.python import vision


class HandPose:
    def __init__(self, model_path="path/to/hand_landmarker.task", num_hands=2):
        """Initialize the MediaPipe HandLandmarker."""
        base_options = python.BaseOptions(model_asset_path=model_path)
        options = vision.HandLandmarkerOptions(
            base_options=base_options,
            num_hands=num_hands
        )
        self.detector = vision.HandLandmarker.create_from_options(options)

        # Uniform style for all keypoints (red circles)
        self.landmark_style = solutions.drawing_utils.DrawingSpec(
            color=(0, 0, 255), thickness=2, circle_radius=3
        )

        # Connections for each finger (thumb, index, middle, ring, pinky)
        self.finger_connections = {
            "thumb": [(0, 1), (1, 2), (2, 3), (3, 4)],
            "index": [(0, 5), (5, 6), (6, 7), (7, 8)],
            "middle": [(0, 9), (9, 10), (10, 11), (11, 12)],
            "ring": [(0, 13), (13, 14), (14, 15), (15, 16)],
            "pinky": [(0, 17), (17, 18), (18, 19), (19, 20)],
        }

        # One color per finger (BGR)
        self.finger_colors = {
            "thumb": (0, 255, 0),     # green
            "index": (255, 0, 0),     # blue
            "middle": (0, 255, 255),  # yellow
            "ring": (255, 0, 255),    # magenta
            "pinky": (0, 165, 255),   # orange
        }

    def _draw_landmarks_on_image(self, rgb_image, detection_result):
        """Draw hand keypoints and per-finger connections on the image."""
        annotated_image = np.copy(rgb_image)
        if detection_result.hand_landmarks:
            h, w, _ = annotated_image.shape
            for hand_landmarks in detection_result.hand_landmarks:
                # Draw the keypoints
                for lm in hand_landmarks:
                    cx, cy = int(lm.x * w), int(lm.y * h)
                    cv2.circle(annotated_image, (cx, cy),
                               self.landmark_style.circle_radius,
                               self.landmark_style.color, -1)
                # Draw each finger's connections in its own color
                for finger, connections in self.finger_connections.items():
                    color = self.finger_colors[finger]
                    for start_idx, end_idx in connections:
                        x1, y1 = int(hand_landmarks[start_idx].x * w), int(hand_landmarks[start_idx].y * h)
                        x2, y2 = int(hand_landmarks[end_idx].x * w), int(hand_landmarks[end_idx].y * h)
                        cv2.line(annotated_image, (x1, y1), (x2, y2), color, 2)
        return annotated_image

    def do(self, frame, device=None):
        """Process a single frame and return it with hand landmarks drawn."""
        if frame is None:
            return None
        mp_image = mp.Image(
            image_format=mp.ImageFormat.SRGB,
            data=cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        )
        detection_result = self.detector.detect(mp_image)
        annotated = self._draw_landmarks_on_image(mp_image.numpy_view(), detection_result)
        return cv2.cvtColor(annotated, cv2.COLOR_RGB2BGR)
```
Technical Features and Advantages
Real-Time Performance
- Achieves real-time processing (30+ FPS) on an ordinary CPU
- The optimized model is suitable for mobile deployment
High-Precision Detection
- 21 keypoints provide rich hand pose information
- 3D coordinates enable depth-aware applications
Multi-Hand Support
- Detects and tracks multiple hands simultaneously (see the sketch after this list)
- Processes each hand's keypoints independently
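As referenced above, each detected hand can be handled on its own. The sketch below assumes the hand_landmarks and handedness fields of the Tasks API's HandLandmarkerResult; summarize_hands is an illustrative helper, not a library function:

```python
def summarize_hands(detection_result, image_width, image_height):
    """Return handedness and wrist pixel position for every detected hand."""
    summaries = []
    for i, hand_landmarks in enumerate(detection_result.hand_landmarks):
        # handedness holds one list of Category objects per detected hand
        label = detection_result.handedness[i][0].category_name  # "Left" or "Right"
        wrist = hand_landmarks[0]
        summaries.append({
            "hand": label,
            "wrist_px": (int(wrist.x * image_width), int(wrist.y * image_height)),
        })
    return summaries
```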
Application Scenarios
1. Gesture Control and Interaction
```python
# Simple gesture recognition example
def detect_gesture(hand_landmarks):
    # Check whether the thumb tip and index fingertip touch (a pinch / "OK"-style gesture)
    thumb_tip = hand_landmarks[4]
    index_tip = hand_landmarks[8]
    distance = np.sqrt((thumb_tip.x - index_tip.x) ** 2 +
                       (thumb_tip.y - index_tip.y) ** 2)
    return distance < 0.05  # threshold check
```
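Note that the 0.05 threshold is in normalized image coordinates, so the result depends on how far the hand is from the camera and usually needs tuning. Wiring the check into the per-frame flow could look like the following sketch; annotate_gestures, the overlay text, and its position are arbitrary illustrative choices:

```python
def annotate_gestures(frame_bgr, detection_result):
    """Overlay a label whenever any detected hand forms the pinch gesture."""
    for hand_landmarks in detection_result.hand_landmarks:
        if detect_gesture(hand_landmarks):
            cv2.putText(frame_bgr, "Pinch / OK", (30, 40),
                        cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    return frame_bgr
```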
2. Virtual and Augmented Reality
- Hand-driven control of 3D models
- Natural interaction in AR environments
- Virtual hand-based try-on and preview
3. Sign Language Recognition and Translation
- Real-time sign language motion capture
- Sign-to-text/speech conversion
- Assisting communication for deaf and hard-of-hearing users
4. Medical Rehabilitation
- Hand motor function assessment
- Monitoring rehabilitation training progress
- Fine motor skill testing
5. Education and Training
- Musical instrument practice guidance
- Surgical procedure training
- Handicraft instruction
Conclusion
MediaPipe Hand Landmarker gives developers a powerful, easy-to-use tool for hand pose detection. Its accuracy, real-time performance, and cross-platform support make it a strong choice for building next-generation human-computer interaction applications. As the technology matures, hand pose detection will play a growing role across many fields, from everyday smart-device interaction to professional medical rehabilitation.
With the implementation and application scenarios covered in this article, developers can quickly get started building innovative hand-pose-based applications and bring more natural, intuitive experiences to human-computer interaction.
Interested in PiscTrace or PiscCode? Head over to the official site for more ~ 🔗 PiscTrace