BriefGPT.xyz
Feb, 2017
Sample Efficient Policy Search for Optimal Stopping Domains
Karan Goel, Christoph Dann, Emma Brunskill
TL;DR
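This paper studies optimal stopping problems in which learning and planning must happen simultaneously. It proposes GFSE, a simple and flexible model-free policy search method that improves sample efficiency by exploiting problem structure to reuse data. The method is compared against existing model-based and model-free approaches on three different domains.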
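The data-reuse idea can be illustrated with a small sketch. This is not the authors' GFSE implementation. It assumes that the stopping decision does not influence the observation-generating process, so a trajectory recorded to its full length can be replayed to score any candidate stopping rule. The trajectory format, the threshold policy class, and all function names (`evaluate_stopping_policy`, `threshold_policy`, `search_threshold`) are hypothetical and chosen only for illustration.

```python
def evaluate_stopping_policy(trajectories, should_stop):
    """Average reward of a stopping rule replayed on full-length trajectories.

    Each trajectory is a list of (observation, reward_if_stopped_here) pairs.
    Assumption: stopping does not change the process, so recorded data can be
    reused to score any stopping rule without new samples.
    """
    total = 0.0
    for traj in trajectories:
        # If the rule never fires, the process is stopped at the final step.
        reward = traj[-1][1]
        for t, (obs, stop_reward) in enumerate(traj):
            if should_stop(t, obs):
                reward = stop_reward
                break
        total += reward
    return total / len(trajectories)


def threshold_policy(threshold):
    """Hypothetical policy class: stop as soon as the observation reaches the threshold."""
    return lambda t, obs: obs >= threshold


def search_threshold(trajectories, candidates):
    """Pick the candidate threshold with the best replayed average reward."""
    return max(
        candidates,
        key=lambda c: evaluate_stopping_policy(trajectories, threshold_policy(c)),
    )


if __name__ == "__main__":
    # Two toy full-length trajectories; the reward for stopping equals the observation.
    data = [
        [(1, 1), (5, 5), (2, 2)],
        [(4, 4), (3, 3), (0, 0)],
    ]
    print(search_threshold(data, [1, 4, 5]))
```

Every candidate policy in the search is scored on the same recorded trajectories, which is where the sample-efficiency gain in this sketch comes from.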
Abstract
Arising naturally in many fields, optimal stopping problems consider the question of deciding when to stop an observation-generating process. We examine the problem of simultaneously learning and planning in such