Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

Jul, 2020

Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

Noam Brown, Anton Bakhtin, Adam Lerer, Qucheng Gong

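TL;DR: This paper introduces ReBeL, a general framework for reinforcement learning and search that provably converges to a Nash equilibrium in any two-player zero-sum game. ReBeL also achieves superhuman performance in no-limit Texas hold'em while using less domain knowledge than any prior poker AI.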

Abstract

The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best