BriefGPT.xyz
Feb, 2012
Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search
John Asmuth, Michael L. Littman
TL;DR
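The forward-search sparse sampling (FSSS) algorithm can achieve near Bayes-optimal behavior, which lets Monte-Carlo tree search efficiently handle Markov decision processes (MDPs) with very large or infinite state spaces.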
Abstract
Bayes-optimal behavior, while well-defined, is often difficult to achieve. Recent advances in the use of Monte-Carlo tree search (MCTS) have shown that it is possible to act near-optimally in …