We study computationally efficient methods for finding equilibria in n-player general-sum games, specifically ones that afford complex visuomotor skills. We show how existing methods would struggle in this setting, either computationally or in theory. We then introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated Equilibrium (CCE) of the game. We show empirical convergence in a suite of OpenSpiel games, validated rigorously by exact game solvers. We then deploy NeuPL-JPSRO to complex domains, where our approach enables adaptive coordination in a MuJoCo control domain and skill transfer in capture-the-flag. Our work shows that equilibrium-convergent population learning can be implemented at scale and in generality, paving the way towards solving real-world games between heterogeneous players with mixed motives.

Neural Population Learning beyond Symmetric Zero-sum Games