March 2024
On the Last-Iterate Convergence of Shuffling Gradient Methods
Zijian Liu, Zhengyuan Zhou
TL;DR
For shuffling gradient methods, i.e., stochastic gradient descent (SGD) without replacement, the convergence rate with respect to the objective value is characterized under different curvature assumptions, and the optimality of these rates for the objective value is proved.
Abstract
Shuffling gradient methods, which are also known as stochastic gradient descent (SGD) without replacement, are widely implemented in practice, particularly including three popular algorithms: Random Reshuffle (RR), …
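The abstract refers to Random Reshuffle (RR), the without-replacement variant of SGD in which every epoch visits each component gradient exactly once in a freshly shuffled order. Below is a minimal Python sketch of that sampling scheme on a toy least-squares problem; the names (`random_reshuffle_sgd`, `grad_fn`, the toy data) are illustrative assumptions, and the sketch only shows the algorithm family, not the paper's analysis or its last-iterate rates.

```python
import numpy as np

def random_reshuffle_sgd(grad_fn, x0, n, lr=0.01, epochs=10, seed=0):
    """Minimal Random Reshuffle (RR) sketch: each epoch processes every
    component gradient exactly once, in a freshly shuffled order
    (SGD without replacement)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        perm = rng.permutation(n)      # new random permutation each epoch (RR)
        for i in perm:                 # one full pass over all n components
            x -= lr * grad_fn(x, i)
    return x                           # return the last iterate

# Toy usage: f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2  (hypothetical data)
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, d = 50, 5
    A, b = rng.normal(size=(n, d)), rng.normal(size=n)
    grad = lambda x, i: (A[i] @ x - b[i]) * A[i]   # gradient of the i-th component
    x_last = random_reshuffle_sgd(grad, np.zeros(d), n, lr=0.05, epochs=200)
    print(x_last)
```

Shuffle Once (SO) would draw the permutation a single time before training and reuse it every epoch, while Incremental Gradient (IG) would fix the order 0, 1, ..., n-1; both are one-line changes to the loop above.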