The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games

March 2021

The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games

Chao Yu, Akash Velu, Eugene Vinitsky, Yu Wang, Alexandre Bayen...

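TL;DR: Using four popular multi-agent testbeds, this study shows that a PPO-based multi-agent algorithm achieves surprisingly strong performance with reduced sample complexity, indicating that it can serve as a strong baseline for cooperative multi-agent reinforcement learning.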

Abstract

Proximal Policy Optimization (PPO) is a popular on-policy reinforcement learning algorithm, but it is used significantly less often than off-policy learning algorithms in multi-agent problems. In this work, we investig