Jul, 2020
Combining Deep Reinforcement Learning and Search for Imperfect-Information Games
Noam Brown, Anton Bakhtin, Adam Lerer, Qucheng Gong
TL;DR
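This paper introduces ReBeL, a general framework for reinforcement learning and search that provably converges to a Nash equilibrium in any two-player zero-sum game. ReBeL also achieves superhuman performance in no-limit Texas hold'em while using less domain knowledge than any prior poker AI.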
Abstract
The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best …