BriefGPT.xyz
May, 2019
Finite-Sample Analysis of Nonlinear Stochastic Approximation with Applications in Reinforcement Learning
Finite-Time Analysis of Q-Learning with Linear Function Approximation
Zaiwei Chen, Sheng Zhang, Thinh T. Doan, Siva Theja Maguluri, John-Paul Clarke
TL;DR
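This paper studies a nonlinear stochastic approximation algorithm under Markovian noise, establishes finite-sample convergence bounds for different choices of learning rate, and shows that the results apply to the Q-learning algorithm.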
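For readers unfamiliar with the algorithm named in the title, the following is a minimal sketch of Q-learning with linear function approximation, run on a single trajectory of Markovian samples. It is not taken from the paper: the two-state toy environment, the one-hot feature map, the uniformly random behavior policy, and all names and parameter values (`features`, `toy_env_step`, `run_toy_example`, `alpha`, `gamma`) are assumptions chosen only for illustration.

```python
import random

NUM_STATES, NUM_ACTIONS = 2, 2


def features(state, action):
    """Hypothetical one-hot feature map phi(s, a); the tabular case is a
    special case of linear function approximation."""
    phi = [0.0] * (NUM_STATES * NUM_ACTIONS)
    phi[state * NUM_ACTIONS + action] = 1.0
    return phi


def q_value(theta, phi):
    """Linear approximation Q(s, a) = phi(s, a)^T theta."""
    return sum(t * f for t, f in zip(theta, phi))


def q_learning_step(theta, phi_sa, reward, next_phis, alpha, gamma):
    """One Q-learning update: theta += alpha * td_error * phi(s, a)."""
    target = reward + gamma * max(q_value(theta, p) for p in next_phis)
    td_error = target - q_value(theta, phi_sa)
    return [t + alpha * td_error * f for t, f in zip(theta, phi_sa)]


def toy_env_step(state, action):
    """Assumed toy dynamics: action 0 stays, action 1 switches state;
    the reward is 1 whenever the next state is state 1."""
    next_state = state if action == 0 else 1 - state
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward


def run_toy_example(num_steps=5000, alpha=0.1, gamma=0.5, seed=0):
    """Run Q-learning along one trajectory with a constant step size."""
    rng = random.Random(seed)
    theta = [0.0] * (NUM_STATES * NUM_ACTIONS)
    state = 0
    for _ in range(num_steps):
        # Uniformly random behavior policy (an assumption of this sketch).
        action = rng.randrange(NUM_ACTIONS)
        next_state, reward = toy_env_step(state, action)
        next_phis = [features(next_state, a) for a in range(NUM_ACTIONS)]
        theta = q_learning_step(
            theta, features(state, action), reward, next_phis, alpha, gamma
        )
        state = next_state
    return theta
```

Calling `run_toy_example()` returns the weight vector indexed by `state * NUM_ACTIONS + action`. For this toy problem with `gamma=0.5`, the optimal action values work out by hand to 1, 2, 2, 1, and the learned weights should approach them. With general (non-one-hot) features, convergence is not guaranteed without further conditions, which is the kind of question the paper's analysis addresses.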
Abstract
In this paper, we consider the model-free reinforcement learning problem and study the popular Q-learning algorithm with linear function approximation for finding the optimal policy. Despite its popularity, it is