BriefGPT.xyz
Jun, 2018
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
Jalaj Bhandari, Daniel Russo, Raghav Singal
TL;DR
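This paper provides a simple and explicit finite-time analysis of temporal difference learning with linear function approximation and examines its applicability in reinforcement learning. The analysis extends to TD(λ) learning and to Q-learning applied to high-dimensional optimal stopping problems.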
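For readers unfamiliar with the algorithm being analyzed, here is a minimal sketch of TD(0) with a linear value approximation on a small Markov reward process. It is an illustration only, not code from the paper: the function name `td0_linear`, its parameters, the constant step size, and the two-state example are all assumptions made here.

```python
import numpy as np


def td0_linear(features, transitions, rewards, gamma=0.9,
               step_size=0.05, num_steps=20000, seed=0):
    """TD(0) with a linear approximation V(s) ~= features[s] @ theta.

    Hypothetical helper for illustration. The policy is assumed to be
    fixed already, so the process is described by a transition matrix
    and a per-state reward vector.
    """
    rng = np.random.default_rng(seed)
    n_states, dim = features.shape
    theta = np.zeros(dim)
    state = 0
    for _ in range(num_steps):
        # Sample the next state from the row of the transition matrix.
        next_state = rng.choice(n_states, p=transitions[state])
        reward = rewards[state]
        # TD error: one-step bootstrapped target minus current estimate.
        td_error = (reward
                    + gamma * features[next_state] @ theta
                    - features[state] @ theta)
        # Semi-gradient update along the current state's feature vector.
        theta += step_size * td_error * features[state]
        state = next_state
    return theta


if __name__ == "__main__":
    # Assumed toy example: two states that alternate deterministically,
    # with one-hot (tabular) features and reward 1 only in state 0.
    P = np.array([[0.0, 1.0], [1.0, 0.0]])
    r = np.array([1.0, 0.0])
    theta = td0_linear(np.eye(2), P, r, gamma=0.5,
                       step_size=0.1, num_steps=5000)
    # Exact values solve V = r + gamma * P V, giving V = [4/3, 2/3].
    print(theta)
```

With one-hot features the estimate can be compared against the exact solution of V = r + γPV, which makes the sketch easy to sanity-check. With general features and random transitions, a constant step size leaves residual noise around the TD fixed point.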
Abstract
Temporal difference learning (TD) is a simple iterative algorithm used to estimate the value function corresponding to a given policy in a Markov decision process. Although TD is one of the most widely used algor