Siqi Liu, Luke Marris, Daniel Hennes, Josh Merel, Nicolas Heess...
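TL;DR: This paper proposes an algorithm called Neural Population Learning (NeuPL), which learns a variety of distinct policies in games. It effectively addresses two problems that arise in real games: insufficient training under a limited budget, and repeated relearning of basic skills. A range of tests verifies the algorithm's robustness and efficiency.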
Abstract
Learning in strategy games (e.g. StarCraft, poker) requires the discovery of diverse policies. This is often achieved by iteratively training new policies against existing ones, growing a policy population that i