BriefGPT.xyz
Jul, 2022
Regret Minimization and Convergence to Equilibria in General-sum Markov Games
Liad Erez, Tal Lancewicki, Uri Sherman, Tomer Koren, Yishay Mansour
TL;DR
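In short, this paper proposes a decentralized, computationally efficient algorithm for general-sum Markov games that guarantees sublinear regret for every agent when all agents use it, and requires no communication between agents. The key observation behind the algorithm is that online learning in Markov games essentially reduces to a form of weighted regret minimization.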
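To make the term "weighted regret" concrete, here is a minimal Python sketch. It is not the paper's algorithm: it swaps in the standard Hedge (multiplicative weights) learner on a single decision point, and the function name `hedge_weighted_regret`, the step size `eta`, and the per-round weights are all assumptions chosen for illustration. Weighted regret is taken here to be the learner's weighted cumulative loss minus that of the best fixed action in hindsight.

```python
import math

def hedge_weighted_regret(losses, weights, eta=0.1):
    """Run Hedge on a non-empty loss sequence; return (weighted regret, last strategy).

    losses[t][a] is the loss of action a in round t, and weights[t] is the
    (hypothetical) importance weight of round t.
    """
    n = len(losses[0])
    cum = [0.0] * n      # weighted cumulative loss of each fixed action
    learner = 0.0        # weighted cumulative loss of the learner
    x = [1.0 / n] * n
    for loss, w in zip(losses, weights):
        # Hedge: probabilities proportional to exp(-eta * cumulative loss).
        # Subtracting the minimum keeps the exponentials numerically stable.
        m = min(cum)
        scores = [math.exp(-eta * (c - m)) for c in cum]
        z = sum(scores)
        x = [s / z for s in scores]
        learner += w * sum(xi * li for xi, li in zip(x, loss))
        cum = [c + w * li for c, li in zip(cum, loss)]
    return learner - min(cum), x

# Usage: action 0 always costs 0, action 1 always costs 1, uniform weights.
regret, strategy = hedge_weighted_regret([[0.0, 1.0]] * 50, [1.0] * 50)
```

In this toy run the learner shifts its probability mass toward action 0, so the weighted regret stays far below the total weight of 50.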
Abstract
An abundance of recent impossibility results establish that regret minimization in Markov games with adversarial opponents is both statistically and computationally intractable. Nevertheless, none of these result …