Game-based decision-making involves reasoning over both world dynamics and
strategic interactions among the agents. Typically, empirical models capturing
these respective aspects are learned and used separately. We investigate the
potential gain from co-learning these elements: a world model for dynamics and
an empirical game for strategic interactions. Empirical games drive world
models toward a broader consideration of possible game dynamics induced by a
diversity of strategy profiles. Conversely, world models guide empirical games
to efficiently discover new strategies through planning. We demonstrate these
benefits first independently, then in combination as realized by a new
algorithm, Dyna-PSRO, that co-learns an empirical game and a world model.
Compared to PSRO, a baseline algorithm for building empirical games, Dyna-PSRO
is found to compute lower-regret solutions on partially observable general-sum
games. In our experiments, Dyna-PSRO also requires substantially fewer
experiences than PSRO, a key algorithmic advantage for settings where
collecting player-game interaction data is a cost-limiting factor.

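This work explores co-learning the two aspects of game-based decision-making, world dynamics and strategic interactions, and introduces a new algorithm, Dyna-PSRO. On partially observable general-sum games, Dyna-PSRO computes lower-regret solutions than the baseline algorithm PSRO and requires far less player-game interaction data.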