Aug, 2013
The Sample-Complexity of General Reinforcement Learning
Tor Lattimore, Marcus Hutter, Peter Sunehag
TL;DR
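This paper presents a new algorithm for general reinforcement learning in the setting where the true environment belongs to a class of N arbitrary models. The algorithm is shown to be near-optimal on all but O(N log^2 N) time steps. Infinite classes are also considered: compactness is shown to be the key criterion for the existence of uniform sample-complexity bounds. A matching lower bound is given for the finite case.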
Abstract
We present a new algorithm for general reinforcement learning where the true environment is known to belong to a finite class of N arbitrary models …