BriefGPT.xyz
Oct, 2021
Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent
Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad
TL;DR
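Using techniques such as exponential step-sizes and stochastic line search, stochastic gradient descent is made adaptive to the noise level and to problem-dependent constants. For strongly-convex functions it attains a convergence rate close to the theoretical optimum, while still coping effectively with noise and with non-convex settings.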
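As a rough illustration of the exponential step-size idea mentioned in the TL;DR, the sketch below runs SGD on a toy one-dimensional quadratic. The schedule `gamma_k = gamma0 * alpha**k` with `alpha = (beta / T) ** (1 / T)` is one common form of exponentially decreasing step-sizes and is an assumption here, not taken from this summary. The function names, the toy objective, and the noise model are likewise hypothetical.

```python
import random


def exponential_step_sizes(T, gamma0, beta=1.0):
    """Return gamma_k = gamma0 * alpha**k for k = 1..T.

    Assumed schedule: alpha = (beta / T) ** (1 / T), so the last
    step-size equals gamma0 * beta / T.
    """
    alpha = (beta / T) ** (1.0 / T)
    return [gamma0 * alpha ** k for k in range(1, T + 1)]


def sgd_quadratic(T=1000, sigma=0.1, seed=0):
    """Run SGD on the toy objective f(w) = 0.5 * w**2.

    The stochastic gradient is the true gradient w plus Gaussian
    noise with standard deviation sigma (a hypothetical noise model).
    """
    rng = random.Random(seed)
    w = 5.0  # arbitrary starting point
    for gamma in exponential_step_sizes(T, gamma0=1.0):
        grad = w + sigma * rng.gauss(0.0, 1.0)
        w -= gamma * grad
    return w
```

Early iterations take large steps that shrink the initial error quickly, and the later, smaller steps damp the effect of gradient noise. The schedule itself does not need `sigma` as an input.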
Abstract
We design step-size schemes that make stochastic gradient descent (SGD) adaptive to (i) the noise $\sigma^2$ in the stochastic gradients and (ii) problem-dependent constants. When minimizing smooth, strongly-convex func