BriefGPT.xyz
Feb, 2021
Bootstrapped Q-Evaluation Optimization Based on Heuristic Policy Evaluation
Bootstrapping Statistical Inference for Off-Policy Evaluation
Botao Hao, Xiang Ji, Yaqi Duan, Hao Lu...
TL;DR
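This paper explores the use of bootstrapping in reinforcement learning and how to improve its computational efficiency. It uses the fitted Q-evaluation (FQE) method for policy evaluation, and assesses the potential of bootstrapping in reinforcement learning through numerical experiments.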
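As a rough illustration of what bootstrapping an FQE estimate can look like, here is a minimal sketch in pure Python. It is not the authors' implementation. It assumes a tabular setting (where the regression step of FQE reduces to a per-cell average of targets), resampling of whole episodes with replacement, and a simple percentile interval. The two-state toy MDP and every function name (`simulate_episodes`, `fitted_q_evaluation`, `bootstrap_interval`) are hypothetical and exist only for this example.

```python
import random


def simulate_episodes(n_episodes=100, horizon=5, noise_sd=0.2, seed=0):
    """Toy 2-state MDP (an assumption for illustration only).

    Taking action a moves to state a; arriving in state 1 pays about 1,
    arriving in state 0 pays about 0. Data come from a uniform behavior policy.
    """
    rng = random.Random(seed)
    episodes = []
    for _ in range(n_episodes):
        state, episode = 0, []
        for _ in range(horizon):
            action = rng.randrange(2)
            next_state = action
            reward = float(next_state) + rng.gauss(0.0, noise_sd)
            episode.append((state, action, reward, next_state))
            state = next_state
        episodes.append(episode)
    return episodes


def fitted_q_evaluation(episodes, policy, n_states=2, n_actions=2,
                        gamma=0.9, n_iters=50):
    """Tabular FQE: repeatedly fit Q(s, a) to r + gamma * Q(s', policy[s'])."""
    transitions = [t for episode in episodes for t in episode]
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iters):
        sums = [[0.0] * n_actions for _ in range(n_states)]
        counts = [[0] * n_actions for _ in range(n_states)]
        for s, a, r, s_next in transitions:
            sums[s][a] += r + gamma * q[s_next][policy[s_next]]
            counts[s][a] += 1
        # With a table, the least-squares fit is the mean target in each cell;
        # cells with no data keep their previous value.
        q = [[sums[s][a] / counts[s][a] if counts[s][a] else q[s][a]
              for a in range(n_actions)] for s in range(n_states)]
    return q


def bootstrap_interval(episodes, policy, start_state=0,
                       n_boot=100, alpha=0.1, seed=1):
    """Percentile bootstrap interval for the target policy's value."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample whole episodes with replacement, then rerun FQE.
        resampled = [rng.choice(episodes) for _ in episodes]
        q = fitted_q_evaluation(resampled, policy)
        estimates.append(q[start_state][policy[start_state]])
    estimates.sort()
    k = int(alpha / 2 * n_boot)
    return estimates[k], estimates[n_boot - 1 - k]


if __name__ == "__main__":
    data = simulate_episodes()
    target_policy = [1, 1]  # always take action 1
    q_hat = fitted_q_evaluation(data, target_policy)
    print("point estimate:", q_hat[0][target_policy[0]])
    print("bootstrap interval:", bootstrap_interval(data, target_policy))
```

In this toy problem the target policy earns about 1 per step, so its discounted value from state 0 is close to 1 / (1 - 0.9) = 10, which gives a reference point for the estimate and interval. Each bootstrap replicate reruns FQE from scratch, which is where the computational cost mentioned in the summary comes from.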
Abstract
Bootstrapping provides a flexible and effective approach for assessing the quality of batch reinforcement learning, yet its theoretical property is less understood. In this paper, we study the use of