Non-Zero-Sum Multi-Agent Training with Correlated Equilibrium Meta-Solvers

June 2021

Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

Luke Marris, Paul Muller, Marc Lanctot, Karl Tuyls, Thore Graepel

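TL;DR: The paper proposes Joint Policy-Space Response Oracles (JPSRO), an algorithm for training agents in n-player, general-sum games. It suggests correlated equilibria (CE) as a promising class of meta-solvers and introduces a new solution concept, the Maximum Gini Correlated Equilibrium (MGCE). Several experiments running JPSRO with CE meta-solvers demonstrate convergence on n-player, general-sum games.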
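To make the two concepts named in the TL;DR concrete, here is a minimal sketch, not the paper's implementation. It checks whether a joint distribution over action pairs is a correlated equilibrium of a two-player game, and computes its Gini impurity, taken here as 1 minus the sum of squared probabilities (the quantity that, as the name suggests, MGCE maximizes among CEs). The game of Chicken, its payoff values, and the helper names `ce_max_gain` and `gini_impurity` are illustrative assumptions and do not come from the source.

```python
import numpy as np


def ce_max_gain(payoffs, joint):
    """Largest expected gain any player gets by deviating from a recommendation.

    payoffs: array of shape (2, A0, A1), payoffs[p][a0, a1] is player p's payoff.
    joint:   array of shape (A0, A1), a probability distribution over action pairs.
    The distribution is a correlated equilibrium iff the returned value is <= 0.
    """
    worst = -np.inf
    for player in range(2):
        # Put the deviating player's own action on axis 0.
        u = payoffs[player] if player == 0 else payoffs[player].T
        p = joint if player == 0 else joint.T
        n_actions = u.shape[0]
        for rec in range(n_actions):        # recommended action
            for dev in range(n_actions):    # candidate deviation
                if dev == rec:
                    continue
                # Gain from playing `dev` whenever `rec` is recommended,
                # weighted by the joint probability of each opponent action.
                gain = float(np.dot(p[rec], u[dev] - u[rec]))
                worst = max(worst, gain)
    return worst


def gini_impurity(joint):
    """Gini impurity of a distribution: 1 - sum of squared probabilities."""
    return 1.0 - float(np.sum(joint ** 2))


# Hypothetical game of Chicken; action 0 = Dare, action 1 = Chicken.
payoffs = np.array([
    [[0.0, 7.0], [2.0, 6.0]],   # row player
    [[0.0, 2.0], [7.0, 6.0]],   # column player
])

# Uniform over (Dare, Chicken), (Chicken, Dare), (Chicken, Chicken).
candidate = np.array([[0.0, 1 / 3], [1 / 3, 1 / 3]])
# All mass on (Dare, Dare): each player would rather back down.
both_dare = np.array([[1.0, 0.0], [0.0, 0.0]])

print("candidate is CE:", ce_max_gain(payoffs, candidate) <= 1e-9)
print("both_dare is CE:", ce_max_gain(payoffs, both_dare) <= 1e-9)
print("Gini impurity of candidate:", gini_impurity(candidate))
```

A full MGCE solver would instead search over all distributions satisfying these linear constraints for the one with the highest Gini impurity, which is a quadratic program; this sketch only evaluates given candidates.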

Abstract

Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting. We propose Joint Policy-Space Response Oracles (JPSRO), an algorithm for training ag