BriefGPT.xyz
May, 2019
An Improved Convergence Analysis of Stochastic Variance-Reduced Policy Gradient
Pan Xu, Felicia Gao, Quanquan Gu
TL;DR
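This study improves the convergence and sample complexity analysis of the SVRPG method, and verifies the influence of the importance sampling weights and the batch size parameters through theoretical analysis and experiments.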
Abstract
We revisit the stochastic variance-reduced policy gradient (SVRPG) method proposed by Papini et al. (2018) for reinforcement learning. We provide an improved convergence analysis of SVRPG and show that it can fin