Sample Efficient Policy Search for Optimal Stopping Domains

February 2017

Karan Goel, Christoph Dann, Emma Brunskill

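TL;DR: This paper studies the problem of simultaneously learning and planning in optimal stopping domains. It proposes GFSE, a simple and flexible model-free policy search method that improves sample efficiency by exploiting problem structure to reuse data. The method is compared against existing model-based and model-free approaches on three different domains.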
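
To make the data-reuse idea concrete, here is a minimal sketch of one way recorded data can be shared across candidate stopping rules. It is not the authors' GFSE implementation. It assumes that the decision to continue does not influence the observation process, so a single batch of full-length trajectories can score any number of stopping rules. The trajectory format (a list of per-step stopping payoffs), the threshold policy class, and all function names are hypothetical choices made for illustration.

```python
import random


def evaluate_stopping_policy(trajectories, should_stop):
    """Average payoff a stopping rule would earn on recorded full-length trajectories.

    Each trajectory is a list where rewards[t] is the payoff for stopping at
    step t (a hypothetical format chosen for this sketch).
    """
    total = 0.0
    for rewards in trajectories:
        stop_t = len(rewards) - 1  # forced stop at the horizon if the rule never fires
        for t, r in enumerate(rewards):
            if should_stop(t, r):
                stop_t = t
                break
        total += rewards[stop_t]
    return total / len(trajectories)


def make_threshold_policy(threshold):
    """Hypothetical policy class: stop as soon as the observed payoff reaches the threshold."""
    return lambda t, r: r >= threshold


# Synthetic full-length trajectories, collected once and never regenerated.
rng = random.Random(0)
trajectories = [[rng.random() for _ in range(10)] for _ in range(200)]

# Every candidate policy is scored on the same recorded data.
candidates = [0.5, 0.6, 0.7, 0.8, 0.9]
scores = {
    th: evaluate_stopping_policy(trajectories, make_threshold_policy(th))
    for th in candidates
}
best_threshold = max(scores, key=scores.get)
```

Because all candidates are evaluated on the same trajectories, comparing more policies costs extra computation but no additional samples from the environment.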

Abstract

Arising naturally in many fields, optimal stopping problems consider the question of deciding when to stop an observation-generating process. We examine the problem of simultaneously learning and planning in such