Multi-Agent Adversarial Inverse Reinforcement Learning

July 2019

Lantao Yu, Jiaming Song, Stefano Ermon

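TL;DR: This paper proposes a new multi-agent inverse reinforcement learning framework (MA-AIRL) that effectively handles Markov games with high-dimensional spaces and unknown dynamics, and shows that MA-AIRL significantly outperforms existing methods at policy imitation.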

Abstract

Reinforcement learning agents are prone to undesired behaviors due to reward mis-specification. Finding a set of reward functions to properly guide agent behaviors is particularly challenging in multi-agent scenarios.