October 2021
Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu, Sham M. Kakade
TL;DR
This paper gives a theoretical analysis of the risk of the last iterate of stochastic gradient descent with a geometrically decaying stepsize for overparameterized linear regression, and discusses how different decay schemes affect the algorithm's performance.
Abstract
Stochastic gradient descent (SGD) has been demonstrated to generalize well in many deep learning applications. In practice, one often runs SGD with a geometrically decaying stepsize, i.e., a constant initial step…
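
The abstract describes running SGD with a constant initial stepsize that is then decayed geometrically, taking the last iterate as the output. Below is a minimal sketch of that schedule on a toy overparameterized least-squares problem; the function name, hyperparameters, and toy data are illustrative assumptions, not the paper's exact setting or results.

```python
import numpy as np

def sgd_geometric_decay(X, y, n_steps=1000, init_stepsize=0.1,
                        decay_factor=0.5, n_phases=4, seed=0):
    """SGD for least-squares regression with a geometrically decaying stepsize.

    The stepsize is held constant within each phase and multiplied by
    `decay_factor` at the start of each new phase; the last iterate
    (not an average) is returned as the output.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    phase_len = n_steps // n_phases
    for t in range(n_steps):
        stepsize = init_stepsize * decay_factor ** (t // phase_len)
        i = rng.integers(n)                      # sample one example
        grad = (X[i] @ w - y[i]) * X[i]          # gradient of 0.5 * (x_i^T w - y_i)^2
        w -= stepsize * grad
    return w  # last iterate

# Toy overparameterized problem: more features (d) than samples (n).
rng = np.random.default_rng(1)
n, d = 50, 200
X = rng.standard_normal((n, d)) / np.sqrt(d)
w_star = rng.standard_normal(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)
w_hat = sgd_geometric_decay(X, y)
print("training MSE:", np.mean((X @ w_hat - y) ** 2))
```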