June 2019
SVRG for Policy Evaluation with Fewer Gradient Evaluations
Zilun Peng, Ahmed Touati, Pascal Vincent, Doina Precup
TL;DR
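This paper proposes two variants of the Stochastic Variance-Reduced Gradient (SVRG) method for policy evaluation. Both substantially reduce the number of gradient computations while retaining a linear convergence rate. The theoretical analysis shows that the methods do not need the entire dataset at every iteration and that the results apply only to problems with linear function approximation. Experiments demonstrate large computational savings.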
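For orientation, the following is a minimal sketch of standard SVRG on a finite-sum least-squares problem. It is not the paper's method: neither of the two proposed variants nor the policy-evaluation objective is shown here. The function name `svrg_least_squares`, its parameters, and the least-squares objective are all assumptions chosen for illustration. The sketch makes visible the cost the paper targets: each outer epoch recomputes a full gradient over all `n` samples.

```python
import numpy as np


def svrg_least_squares(A, b, step_size, num_epochs, inner_steps, seed=0):
    """Standard SVRG on f(w) = (1/n) * sum_i 0.5 * (a_i @ w - b_i)**2.

    Illustrative baseline only (hypothetical helper, not from the paper).
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w_snapshot = np.zeros(d)
    for _ in range(num_epochs):
        # Full gradient at the snapshot: one pass over all n samples.
        full_grad = A.T @ (A @ w_snapshot - b) / n
        w = w_snapshot.copy()
        for _ in range(inner_steps):
            i = rng.integers(n)  # sample one data point uniformly
            grad_w = A[i] * (A[i] @ w - b[i])
            grad_snap = A[i] * (A[i] @ w_snapshot - b[i])
            # Variance-reduced step: unbiased estimate of the full gradient at w.
            w -= step_size * (grad_w - grad_snap + full_grad)
        w_snapshot = w
    return w_snapshot
```

A possible call, on synthetic data, would be `svrg_least_squares(A, b, step_size=0.1 / max_row_norm_sq, num_epochs=50, inner_steps=2 * len(b))`, where `max_row_norm_sq` is the largest squared row norm of `A`; the step size and epoch length here are conservative guesses, not tuned values.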
Abstract
Stochastic variance-reduced gradient (SVRG) is an optimization method originally designed for tackling machine learning problems with a finite sum structure. SVRG was later shown to work for policy evaluation, a …