Mar, 2022
COLA: Consistent Learning with Opponent-Learning Awareness
Timon Willi, Johannes Treutlein, Alistair Letcher, Jakob Foerster
TL;DR
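The authors extend the LOLA algorithm with a method called Consistent LOLA, in which the agents' learning update functions remain consistent with one another as they influence each other. In a series of experiments on general-sum games, they find that this method converges more readily than HOLA and LOLA and reaches more socially desirable solutions.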
Abstract
Learning in general-sum games can be unstable and often leads to socially undesirable, Pareto-dominated outcomes. To mitigate this, learning with opponent-learning awareness (