Balancing Two-Player Stochastic Games with Soft Q-Learning

Feb, 2018

Jordi Grau-Moya, Felix Leibfried, Haitham Bou-Ammar

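TL;DR: This paper applies soft Q-learning to multi-agent systems in stochastic games, yielding agent policies whose behavior can be tuned. Through theoretical and experimental contributions, it shows that soft Q-learning can perform well across a range of different game types.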
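
To make "tunable" concrete, here is a minimal sketch of the soft (log-sum-exp) value operator that soft Q-learning is generally built on. It is an illustration under stated assumptions, not the authors' implementation: the names `soft_value`, `soft_policy`, `beta` (an inverse temperature), and `prior` (a reference policy) are hypothetical, and the two-player construction used in the paper is not reproduced here.

```python
import math


def _uniform(n):
    """Uniform reference distribution over n actions (assumed default)."""
    return [1.0 / n] * n


def soft_value(q_values, beta, prior=None):
    """Soft aggregate (1/beta) * log sum_a prior(a) * exp(beta * Q(a)).

    beta -> +inf approaches max(Q), beta -> -inf approaches min(Q),
    and beta == 0 gives the expectation of Q under the prior.
    """
    prior = prior or _uniform(len(q_values))
    if beta == 0.0:
        return sum(p * q for p, q in zip(prior, q_values))
    scaled = [beta * q for q in q_values]
    shift = max(scaled)  # subtract the largest exponent for numerical stability
    total = sum(p * math.exp(s - shift) for p, s in zip(prior, scaled))
    return (shift + math.log(total)) / beta


def soft_policy(q_values, beta, prior=None):
    """Boltzmann-style policy proportional to prior(a) * exp(beta * Q(a))."""
    prior = prior or _uniform(len(q_values))
    scaled = [beta * q for q in q_values]
    shift = max(scaled)
    weights = [p * math.exp(s - shift) for p, s in zip(prior, scaled)]
    norm = sum(weights)
    return [w / norm for w in weights]


if __name__ == "__main__":
    q = [1.0, 2.0, 4.0]  # made-up action values for illustration
    for beta in (-50.0, 0.0, 1.0, 50.0):
        print(beta, soft_value(q, beta), soft_policy(q, beta))
```

In this sketch a single scalar interpolates between worst-case, prior-following, and greedy behavior, which is one common way to read "tunable" for an agent or its opponent.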

Abstract

Within the context of video games the notion of perfectly rational agents can be undesirable as it leads to uninteresting situations, where humans face tough adversarial decision makers. Current frameworks for stochasti