An Efficient Dynamic Sampling Policy for Monte Carlo Tree Search

Apr, 2022

Gongbo Zhang, Yijie Peng, Yilong Xu

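TL;DR: This paper studies Monte Carlo tree search (MCTS), a tree-based search strategy, in the setting of finite-horizon Markov decision processes. It proposes a dynamic sampling tree policy that efficiently allocates a limited computational budget to maximize the probability of correctly selecting the best action at the root node. Experimental results indicate that the proposed tree policy is more efficient than other competing methods.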

Abstract

We consider Monte Carlo tree search (MCTS), the popular tree-based search strategy within the framework of reinforcement learning, in the context of finite-horizon Markov decision processes. We propose a