Feb, 2021
Almost Optimal Algorithms for Two-player Markov Games with Linear Function Approximation
Zixiang Chen, Dongruo Zhou, Quanquan Gu
TL;DR
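The paper introduces Nash-UCRL, an algorithm based on the principle of optimism in the face of uncertainty. It efficiently finds the Nash equilibrium of the two players by computing a coarse correlated equilibrium, and its regret upper bound is shown to match the lower bound. The result gives a method for solving two-player zero-sum Markov games in the finite-horizon setting.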
Abstract
We study reinforcement learning for two-player zero-sum Markov games with simultaneous moves in the finite-horizon setting, where the transition kernel of the underlying ...