A Survey of Robot Imitation Learning

news 2025/9/26 13:26:52

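1. The current state of learning-based methods in mobile manipulation

Learning-based methods have become a mainstream approach to mobile manipulation tasks and substantially reduce the burden of traditional engineering design. Their main advantage is that they adapt to complex environments from data, which avoids the limitations of hand-designed rules.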

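2. How traditional methods handle exploration in high-dimensional spaces

To make exploration tractable in the high-dimensional state and action spaces of mobile manipulation, prior work has mainly used three strategies:

2.1 Predefined skill primitives:

A complex task is decomposed into basic action units (such as grasping and moving), which are then composed to achieve overall control【1】【2】【3】.

Example: atomic skills such as "object pickup" and "path planning" are designed in advance and scheduled for execution by a finite state machine.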
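The finite-state-machine scheduling described above can be sketched as follows. This is a minimal illustration rather than the method of any cited paper: the primitive names, the dictionary-based world state, and the transition table are all assumptions made up for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical world state: a plain dict that each primitive reads and updates.
World = Dict[str, object]

def navigate_to_object(world: World) -> bool:
    """Stand-in for a path-planning primitive: drive the base to the object."""
    world["base_at"] = "object"
    return True

def pick_object(world: World) -> bool:
    """Stand-in for a grasping primitive: succeeds only if the base is at the object."""
    if world.get("base_at") != "object":
        return False
    world["holding"] = True
    return True

def navigate_to_goal(world: World) -> bool:
    """Stand-in for a path-planning primitive: drive the base to the drop-off point."""
    world["base_at"] = "goal"
    return True

def place_object(world: World) -> bool:
    """Stand-in for a placing primitive: needs the base at the goal and an object in hand."""
    if world.get("base_at") != "goal" or not world.get("holding"):
        return False
    world["holding"] = False
    world["placed"] = True
    return True

# state name -> (primitive, next state on success, next state on failure)
TRANSITIONS: Dict[str, Tuple[Callable[[World], bool], str, str]] = {
    "NAVIGATE_TO_OBJECT": (navigate_to_object, "PICK", "FAILED"),
    "PICK": (pick_object, "NAVIGATE_TO_GOAL", "NAVIGATE_TO_OBJECT"),
    "NAVIGATE_TO_GOAL": (navigate_to_goal, "PLACE", "FAILED"),
    "PLACE": (place_object, "DONE", "FAILED"),
}

def run_fsm(world: World, start: str = "NAVIGATE_TO_OBJECT", max_steps: int = 20) -> List[str]:
    """Run primitives until a terminal state or the step budget is reached.

    Returns the sequence of visited states, ending with the last state reached.
    """
    state: str = start
    visited: List[str] = []
    for _ in range(max_steps):
        if state in ("DONE", "FAILED"):
            break
        visited.append(state)
        primitive, on_success, on_failure = TRANSITIONS[state]
        state = on_success if primitive(world) else on_failure
    visited.append(state)
    return visited
```

Calling `run_fsm({})` walks through the four primitives in order. Starting from `"PICK"` with an empty world shows the failure branch: the grasp fails, the machine falls back to navigation, and then retries.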

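2.2 Reinforcement learning (RL) over a decomposed action space

The action space is split into sub-tasks (such as moving the base and controlling the arm), which are optimized separately and then combined【4】【5】【6】【7】【8】.

Example: the robot's navigation and grasping policies are trained independently with RL and then integrated through a hierarchical architecture.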
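As a rough picture of the hierarchical integration step, the sketch below combines two sub-policies behind a high-level selector. The sub-policies here are hand-written proportional controllers standing in for separately trained RL policies, and the 1D corridor, the `Observation` fields, and the `ARM_REACH` value are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Observation:
    base_x: float       # base position along a 1D corridor (m)
    target_x: float     # object position along the same corridor (m)
    gripper_gap: float  # remaining gripper-to-object distance (m)

ARM_REACH = 0.5  # assumed arm reach in meters

def base_policy(obs: Observation) -> float:
    """Stand-in for a separately trained navigation policy: base velocity command."""
    error = obs.target_x - obs.base_x
    return max(-1.0, min(1.0, error))

def arm_policy(obs: Observation) -> float:
    """Stand-in for a separately trained grasping policy: end-effector velocity command."""
    return max(-0.2, min(0.2, obs.gripper_gap))

def hierarchical_policy(obs: Observation) -> Tuple[float, float]:
    """High-level selector: drive the base until the target is in reach, then use the arm.

    Returns (base velocity, arm velocity); the inactive sub-policy outputs zero.
    """
    if abs(obs.target_x - obs.base_x) > ARM_REACH:
        return base_policy(obs), 0.0
    return 0.0, arm_policy(obs)
```

Each sub-policy only ever sees its own slice of the problem, which is what keeps exploration manageable; the cost is that the base and arm never move at the same time in this simple scheme.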

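2.3 Whole-body control objective optimization

The robot's whole-body state (such as joint angles and end-effector position) is used directly as the objective, and an optimization algorithm solves for the action sequence【9】【10】【11】.

Example: whole-body motion trajectories are optimized with model predictive control (MPC) while maintaining dynamic balance.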
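To make the receding-horizon idea concrete, here is a deliberately small MPC sketch, assuming a kinematic toy model in which the end-effector position is the base position plus the arm extension. It searches all discrete action sequences over a short horizon and applies only the first action. The model, cost weights, and action set are assumptions chosen for illustration, and dynamic balance is not modeled at all.

```python
from itertools import product
from typing import List, Tuple

State = Tuple[float, float]   # (base position, arm extension), meters
Action = Tuple[float, float]  # (base displacement, arm displacement) per step

ARM_MAX = 0.5                 # assumed maximum arm extension
MOVES = (-0.1, 0.0, 0.1)      # assumed discrete displacement options
ACTIONS: List[Action] = list(product(MOVES, MOVES))

def step(state: State, action: Action) -> State:
    """Toy kinematic model: the base moves freely, the arm is clamped to its range."""
    base, arm = state
    d_base, d_arm = action
    return base + d_base, min(ARM_MAX, max(0.0, arm + d_arm))

def stage_cost(state: State, action: Action, goal: float) -> float:
    """Whole-body objective: end-effector error plus effort, with base motion costing more."""
    base, arm = state
    error = (base + arm) - goal
    return error ** 2 + 0.5 * action[0] ** 2 + 0.1 * action[1] ** 2

def plan(state: State, goal: float, horizon: int = 2) -> Action:
    """Exhaustively score every action sequence and return the first action of the best one."""
    best_cost, best_first = float("inf"), (0.0, 0.0)
    for seq in product(ACTIONS, repeat=horizon):
        s, total = state, 0.0
        for a in seq:
            s = step(s, a)
            total += stage_cost(s, a, goal)
        if total < best_cost:
            best_cost, best_first = total, seq[0]
    return best_first

def run_mpc(state: State, goal: float, steps: int = 10) -> State:
    """Receding-horizon loop: replan at every step, apply only the first planned action."""
    for _ in range(steps):
        state = step(state, plan(state, goal))
    return state
```

Because base and arm displacements enter a single cost, the optimizer decides jointly how much each should contribute, which is the essential difference from optimizing them separately. Practical whole-body MPC uses continuous optimization over a dynamic model rather than enumeration.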

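3. Breakthroughs of imitation learning in mobile manipulation

Unlike traditional methods, which rely on action primitives, state estimation, or object priors, imitation learning provides a direct end-to-end mapping:

Key advantage: raw RGB images are mapped directly to whole-body actions with no hand-designed intermediate representation, and training on large-scale real-world data has shown strong results in indoor scenes【12】【13】【14】【15】.
Technical approach: expert demonstration data (such as human operation trajectories) is used to train the agent to imitate the expert policy, through supervised learning or adversarial training.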
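The supervised-learning route, usually called behavior cloning, amounts to regression from observations to expert actions. The sketch below assumes a scalar observation and a linear policy so that the fit has a closed form; real systems replace this with a neural network over images. The `expert` controller and its coefficients are invented for the example.

```python
from typing import Callable, List, Tuple

def fit_linear_policy(demos: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit of action = w * observation + b to (observation, action) pairs."""
    n = len(demos)
    mean_o = sum(o for o, _ in demos) / n
    mean_a = sum(a for _, a in demos) / n
    cov = sum((o - mean_o) * (a - mean_a) for o, a in demos)
    var = sum((o - mean_o) ** 2 for o, _ in demos)
    w = cov / var
    b = mean_a - w * mean_o
    return w, b

def expert(observation: float) -> float:
    """Hypothetical expert: a proportional controller with a small offset."""
    return -0.8 * observation + 0.1

def make_policy(w: float, b: float) -> Callable[[float], float]:
    """Wrap fitted parameters as a callable policy."""
    return lambda observation: w * observation + b

# Collect demonstrations by querying the expert on a grid of observations.
demos = [(o / 10.0, expert(o / 10.0)) for o in range(-10, 11)]
w, b = fit_linear_policy(demos)
cloned_policy = make_policy(w, b)
```

Since the expert here is itself linear and noise-free, the fit recovers its parameters almost exactly. With a real expert the cloned policy only matches on states the demonstrations cover, which is why the quality and coverage of the demonstration data matter so much.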

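4. Collecting expert demonstration data: methods and their evolution

Prior work has collected demonstration data in several ways:

4.1 Interface-driven collection:

VR interfaces (e.g., Oculus Rift)【16】: the operator's actions in a virtual scene are mapped to the real robot's actions;
Smartphone interfaces【19】: these lower the barrier to operation and suit non-expert users.

4.2 Physical interaction devices:

Kinesthetic teaching【17】: the operator physically guides the robot through the motion while the trajectory is recorded;
Motion capture systems【20】: human motion is tracked with optical markers (e.g., a Vicon system).

4.3 Agent-generated data:

Trained RL policies【18】: a policy pretrained with reinforcement learning generates "expert-level" trajectories.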

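5. Teleoperation of humanoid robots

For teleoperating humanoid robots, research has focused on multimodal interaction:

5.1 Motion mapping devices:

Human motion capture suits【22】: sensors synchronize the operator's joint motion to the robot;
Exoskeletons【23】: wearable devices capture the operator's force and posture data (such as electromyography signals and joint angles).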
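Mapping measured human joint angles onto a robot usually needs at least a joint-name correspondence and a clamp to the robot's joint limits. The sketch below shows only that minimal step; the joint names, the mapping table, and the limit values are hypothetical and do not describe any device cited above.

```python
from typing import Dict, Tuple

# Hypothetical robot joint limits in radians.
ROBOT_LIMITS: Dict[str, Tuple[float, float]] = {
    "shoulder_pitch": (-1.5, 1.5),
    "elbow": (0.0, 2.0),
}

# Hypothetical correspondence from measured human joints to robot joints.
JOINT_MAP: Dict[str, str] = {
    "human_shoulder_flexion": "shoulder_pitch",
    "human_elbow_flexion": "elbow",
}

def retarget(human_angles: Dict[str, float]) -> Dict[str, float]:
    """Copy each measured human joint angle to its robot joint, clamped to the robot's limits.

    Human joints that were not measured in this frame are skipped.
    """
    command: Dict[str, float] = {}
    for human_joint, robot_joint in JOINT_MAP.items():
        if human_joint not in human_angles:
            continue
        low, high = ROBOT_LIMITS[robot_joint]
        command[robot_joint] = min(high, max(low, human_angles[human_joint]))
    return command
```

For example, a measured shoulder angle of 2.0 rad would be clamped to the assumed 1.5 rad limit, while an elbow angle of 1.0 rad passes through unchanged. Real retargeting also has to account for differing limb lengths and kinematic structure, which this sketch ignores.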

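5.2 Feedback systems:

VR headsets【24】: provide visual feedback from the robot's point of view;
Haptic feedback devices【25】: convey contact information from the environment (such as ground friction and grasp force) through vibration, pressure, and similar cues.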

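6. Limitations of existing work: no low-cost solution for whole-body demonstrations

Purushottam et al. developed an exoskeleton suit combined with a force plate for whole-body teleoperation of a wheeled humanoid robot, but it has notable drawbacks:

High cost: force plates and specialized exoskeleton hardware are expensive, which makes wide adoption difficult;
Limited applicability: it targets wheeled robots only and does not cover whole-body demonstrations for bimanual mobile manipulation (such as grasping and carrying).

As a result, the field currently lacks a low-cost, general-purpose way to collect expert demonstrations for bimanual mobile manipulation, which holds back large-scale data-driven methods.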

References

【1】Charles Sun, Jedrzej Orbik, Coline Manon Devin, Brian H Yang, Abhishek Gupta, Glen Berseth, and Sergey Levine. Fully autonomous real-world reinforcement learning with applications to mobile manipulation. In Conference on Robot Learning, 2021.

【2】Bohan Wu, Roberto Martín-Martín, and Li Fei-Fei. M-EMBER: Tackling long-horizon mobile manipulation via factorized domain transfer. ICRA, 2023.

【3】Jimmy Wu, Rika Antonova, Adam Kan, Marion Lepert, Andy Zeng, Shuran Song, Jeannette Bohg, Szymon Rusinkiewicz, and Thomas Funkhouser. TidyBot: Personalized robot assistance with large language models. IROS, 2023.

【4】Jiayuan Gu, Devendra Singh Chaplot, Hao Su, and Jitendra Malik. Multi-skill mobile manipulation for object rearrangement. ICLR, 2023.

【5】Snehal Jauhri, Jan Peters, and Georgia Chalvatzaki. Robot learning of mobile manipulation with reachability behavior priors. IEEE Robotics and Automation Letters, 2022.

【6】Yuntao Ma, Farbod Farshidian, Takahiro Miki, Joonho Lee, and Marco Hutter. Combining learning-based locomotion policy with model-based manipulation for legged mobile manipulators. IEEE Robotics and Automation Letters, 2022.

【7】Fei Xia, Chengshu Li, Roberto Martín-Martín, Or Litany, Alexander Toshev, and Silvio Savarese. ReLMoGen: Integrating motion generation in reinforcement learning for mobile manipulation. In 2021 IEEE International Conference on Robotics and Automation (ICRA), 2021.

【8】Naoki Yokoyama, Alexander William Clegg, Eric Undersander, Sehoon Ha, Dhruv Batra, and Akshara Rai. Adaptive skill coordination for robotic mobile manipulation. arXiv preprint arXiv:2304.00410, 2023.

【9】Zipeng Fu, Xuxin Cheng, and Deepak Pathak. Deep whole-body control: learning a unified policy for manipulation and locomotion. In Conference on Robot Learning, 2022.

【10】Jiaheng Hu, Peter Stone, and Roberto Martín-Martín. Causal policy gradient for whole-body mobile manipulation. arXiv preprint arXiv:2305.04866, 2023.

【11】Ruihan Yang, Yejin Kim, Aniruddha Kembhavi, Xiaolong Wang, and Kiana Ehsani. Harmonic mobile manipulation. arXiv preprint arXiv:2312.06639, 2023.

【12】Michael Ahn et al. Do as I can, not as I say: Grounding language in robotic affordances. arXiv preprint arXiv:2204.01691, 2022.

【13】RT-1: Robotics transformer for real-world control at scale. arXiv preprint arXiv:2212.06817, 2022.

【14】Nur Muhammad Mahi Shafiullah, Anant Rai, Haritheja Etukuru, Yiqian Liu, Ishan Misra, Soumith Chintala, and Lerrel Pinto. On bringing robots home. arXiv preprint arXiv:2311.16098, 2023.

【15】Jiayuan Gu, Devendra Singh Chaplot, Hao Su, and Jitendra Malik. Multi-skill mobile manipulation for object rearrangement. ICLR, 2023.

【16】Mingyo Seo, Steve Han, Kyutae Sim, Seung Hyeon Bang, Carlos Gonzalez, Luis Sentis, and Yuke Zhu. Deep imitation learning for humanoid loco-manipulation through human teleoperation. Humanoids, 2023.

【17】Taozheng Yang, Ya Jing, Hongtao Wu, Jiafeng Xu, Kuankuan Sima, Guangzeng Chen, Qie Sima, and Tao Kong. MOMA-Force: Visual-force imitation for real-world mobile manipulation. arXiv preprint arXiv:2308.03624, 2023.

【18】Xiaoyu Huang, Dhruv Batra, Akshara Rai, and Andrew Szot. Skill transformer: A monolithic policy for mobile manipulation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023.

【19】Josiah Wong, Albert Tung, Andrey Kurenkov, Ajay Mandlekar, Li Fei-Fei, Silvio Savarese, and Roberto Martín-Martín. Error-aware imitation learning from teleoperation data for mobile manipulation. In Conference on Robot Learning, 2022.

【20】Miguel Arduengo, Ana Arduengo, Adrià Colomé, Joan Lobo-Prat, and Carme Torras. Human to robot whole-body motion transfer. In 2020 IEEE-RAS 20th International Conference on Humanoid Robots (Humanoids), 2021.

【21】Shikhar Bahl, Abhinav Gupta, and Deepak Pathak. Human-to-robot imitation in the wild. arXiv preprint arXiv:2207.09450, 2022.

【22】R Cisneros, M Benallegue, K Kaneko, H Kaminaga, G Caron, A Tanguy, R Singh, L Sun, A Dallard, C Fournier, et al. Team JANUS humanoid avatar: A cybernetic avatar to embody human telepresence. In Toward Robot Avatars: Perspectives on the ANA Avatar XPRIZE Competition, RSS Workshop, 2022.

【23】Hongjie Fang, Hao-Shu Fang, Yiming Wang, Jieji Ren, Jingjing Chen, Ruo Zhang, Weiming Wang, and Cewu Lu. Low-cost exoskeletons for learning whole-arm manipulation in the wild. arXiv preprint arXiv:2309.14975, 2023.

【24】Jean Chagas Vaz, Dylan Wallace, and Paul Y Oh. Humanoid loco-manipulation of pushed carts utilizing virtual reality teleoperation. In ASME International Mechanical Engineering Congress and Exposition, 2021.

【25】Anais Brygo, Ioannis Sarakoglou, Nadia Garcia-Hernandez, and Nikolaos Tsagarakis. Humanoid robot teleoperation with vibrotactile based balancing feedback. In Haptics: Neuroscience, Devices, Modeling, and Applications: 9th International Conference, EuroHaptics 2014, Versailles, France, June 24-26, 2014, Proceedings, Part II, 2014.
