
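The Unity ML-Agents Toolkit (ML-Agents): an open-source project that enables games and simulations to serve as environments for training intelligent agents with deep reinforcement learning and imitation learning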

news | Source: original | 2025/6/13 9:13:10

1. Software Introduction

Program and source code downloads are provided at the end of the article.

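The Unity ML-Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents. We provide implementations (based on PyTorch) of state-of-the-art algorithms so that game developers and hobbyists can easily train intelligent agents for 2D, 3D, and VR/AR games. Researchers can also use the provided Python API to train agents with reinforcement learning, imitation learning, neuroevolution, or any other method. The trained agents can serve several purposes, including controlling NPC behavior (in a variety of settings, such as multi-agent and adversarial), automated testing of game builds, and evaluating different game design decisions before release. The ML-Agents Toolkit benefits both game developers and AI researchers: it provides a central platform where advances in AI can be evaluated on Unity's rich environments and then made accessible to the wider research and game developer communities.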

2. Features

17+ example Unity environments
Support for multiple environment configurations and training scenarios
Flexible Unity SDK that can be integrated into your game or custom Unity scene
Support for training single-agent, multi-agent cooperative, and multi-agent competitive scenarios via several Deep Reinforcement Learning algorithms (PPO, SAC, MA-POCA, self-play)
Support for learning from demonstrations through two Imitation Learning algorithms (BC and GAIL)
Quickly and easily add your own custom training algorithm and/or components
Easily definable Curriculum Learning scenarios for complex tasks
Train robust agents using environment randomization
Flexible agent control with On-Demand Decision Making
Train using multiple concurrent Unity environment instances
Uses Sentis to provide native cross-platform support
Unity environment control from Python
Wrap Unity learning environments as a Gym environment
Wrap Unity learning environments as a PettingZoo environment
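
The Curriculum Learning feature listed above can be pictured with a small, plain-Python sketch. This is only an illustration of the general idea (start with an easy task and move to a harder one once the agent's recent mean reward passes a threshold). It is not the toolkit's API or configuration format, and the names `Lesson`, `Curriculum`, `wall_height`, `reward_threshold`, and `min_episodes` are hypothetical, introduced here for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Lesson:
    """One curriculum stage (hypothetical structure, for illustration only)."""
    name: str
    wall_height: float       # example environment parameter for this stage
    reward_threshold: float  # mean reward required to advance
    min_episodes: int        # number of recent episodes averaged for the check


class Curriculum:
    """Tracks episode rewards and advances through lessons in order."""

    def __init__(self, lessons: Sequence[Lesson]) -> None:
        if not lessons:
            raise ValueError("a curriculum needs at least one lesson")
        self.lessons: List[Lesson] = list(lessons)
        self.index = 0
        self._rewards: List[float] = []

    @property
    def lesson(self) -> Lesson:
        return self.lessons[self.index]

    def record_episode(self, reward: float) -> bool:
        """Store one episode reward; return True if the curriculum advanced."""
        self._rewards.append(reward)
        lesson = self.lesson
        is_last = self.index == len(self.lessons) - 1
        if is_last or len(self._rewards) < lesson.min_episodes:
            return False
        # Average only the most recent `min_episodes` rewards.
        window = self._rewards[-lesson.min_episodes:]
        if sum(window) / len(window) < lesson.reward_threshold:
            return False
        # Threshold met: move to the next lesson and reset the history.
        self.index += 1
        self._rewards.clear()
        return True


if __name__ == "__main__":
    curriculum = Curriculum([
        Lesson("low wall", wall_height=1.0, reward_threshold=0.5, min_episodes=2),
        Lesson("high wall", wall_height=4.0, reward_threshold=0.9, min_episodes=2),
    ])
    for reward in (0.2, 0.4, 0.8):
        advanced = curriculum.record_episode(reward)
        print(f"reward={reward} advanced={advanced} lesson={curriculum.lesson.name}")
```

In a real training run the current lesson's parameter (here `wall_height`) would be passed to the environment before each episode. The sketch leaves that step out because it depends on how the environment is set up.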

3. Additional Resources

If you are a researcher interested in a discussion of Unity as an AI platform, see the pre-print of our reference paper on Unity and the ML-Agents Toolkit.

If you use Unity or the ML-Agents Toolkit to conduct research, we ask that you cite the following paper as a reference:

<code>@article{juliani2020,
  title={Unity: A general platform for intelligent agents},
  author={Juliani, Arthur and Berges, Vincent-Pierre and Teng, Ervin and Cohen, Andrew and Harper, Jonathan and Elion, Chris and Goy, Chris and Gao, Yuan and Henry, Hunter and Mattar, Marwan and Lange, Danny},
  journal={arXiv preprint arXiv:1809.02627},
  url={https://arxiv.org/pdf/1809.02627.pdf},
  year={2020}
}
</code>

Additionally, if you use the MA-POCA trainer in your research, we ask that you cite the following paper as a reference:

<code>@article{cohen2022,
  title={On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning},
  author={Cohen, Andrew and Teng, Ervin and Berges, Vincent-Pierre and Dong, Ruo-Ping and Henry, Hunter and Mattar, Marwan and Zook, Alexander and Ganguly, Sujoy},
  journal={RL in Games Workshop AAAI 2022},
  url={http://aaai-rlg.mlanctot.info/papers/AAAI22-RLG_paper_32.pdf},
  year={2022}
}
</code>