BriefGPT.xyz
Aug, 2016
On Lower Bounds for Regret in Reinforcement Learning
Ian Osband, Benjamin Van Roy
TL;DR
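This note clarifies the state of regret lower bounds in reinforcement learning, raises a conjecture concerning Theorem 6 of the REGAL paper, and presents a lower bound tighter than the one given by Bartlett and Tewari (2009).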
Abstract
This is a brief technical note to clarify the state of lower bounds on regret for reinforcement learning. In particular, this paper:
- Reproduces a …