Thinking Fast and Slow with Deep Learning and Tree Search

May, 2017

Thomas Anthony, Zheng Tian, David Barber

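TL;DR: This paper introduces Expert Iteration (ExIt), an algorithm that decomposes a reinforcement learning problem into separate planning and generalisation tasks. Tree search does the planning, and a deep neural network generalises those plans. ExIt outperforms standard deep reinforcement learning algorithms when training a neural network to play the board game Hex, and the final agent defeats MoHex 1.0, the most recent Olympiad champion player to be publicly released.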

Abstract

Solving sequential decision making problems, such as text parsing, robotic control, and game playing, requires a combination of planning policies and generalisation of those plans. In this paper, we present Expert Iteration, a novel algorithm which decomposes the problem into separate