BriefGPT.xyz
Jun, 2024
Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms
Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee...
TL;DR
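Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games. Particularly popular algorithms include optimistic multiplicative weights update (OMWU) and optimistic gradient-descent-ascent (OGDA). This paper proves that OMWU can suffer from slow last-iterate convergence.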
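For readers unfamiliar with OMWU, the following is a minimal sketch of self-play with the textbook optimistic multiplicative weights update on a small matrix game. It is not code from the paper: the function names `omwu_self_play` and `duality_gap`, the step size `eta`, the step count, and the 2x2 payoff matrix are all assumptions chosen for illustration. The row player is assumed to minimize `x @ A @ y` and the column player to maximize it.

```python
import numpy as np


def duality_gap(A, x, y):
    """Gap between the best-response payoffs against x and against y."""
    return float(np.max(A.T @ x) - np.min(A @ y))


def omwu_self_play(A, eta=0.05, steps=20000):
    """Run OMWU for both players of the zero-sum game min_x max_y x^T A y.

    Hypothetical helper for illustration; returns the last iterates and
    the duality gap recorded at every step.
    """
    m, n = A.shape
    x = np.full(m, 1.0 / m)  # row player starts from the uniform strategy
    y = np.full(n, 1.0 / n)  # column player starts from the uniform strategy
    prev_lx = np.zeros(m)    # previous loss vector of the row player
    prev_ly = np.zeros(n)    # previous loss vector of the column player
    gaps = []
    for _ in range(steps):
        lx = A @ y           # row player's loss for each pure action
        ly = -(A.T @ x)      # column player's loss (negated payoff)
        gaps.append(duality_gap(A, x, y))
        # Optimistic step: use 2 * current loss - previous loss as the
        # prediction-corrected loss, then renormalize onto the simplex.
        x = x * np.exp(-eta * (2.0 * lx - prev_lx))
        x = x / x.sum()
        y = y * np.exp(-eta * (2.0 * ly - prev_ly))
        y = y / y.sum()
        prev_lx, prev_ly = lx, ly
    return x, y, gaps


if __name__ == "__main__":
    # Assumed example game whose unique equilibrium is interior:
    # x* = y* = (0.4, 0.6).
    A = np.array([[2.0, -1.0], [-1.0, 1.0]])
    x, y, gaps = omwu_self_play(A)
    print("last iterate x:", x, "y:", y, "final gap:", gaps[-1])
```

Tracking the duality gap of the last iterate `(x, y)`, rather than of the averaged iterates, is the quantity that "last-iterate convergence" refers to. On this well-conditioned example the gap shrinks quickly; the slow behavior discussed in the summary concerns harder instances.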
Abstract
Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games, both in theory and practice. Particularly popular algorithms include …