Jun, 2024
Convergence Analysis of Adaptive Gradient Methods under Refined Smoothness and Noise Assumptions
Devyani Maladkar, Ruichen Jiang, Aryan Mokhtari
TL;DR
Analyzes the convergence rate of AdaGrad for stochastic non-convex optimization, proves that it can achieve a convergence rate better than SGD's, and establishes both upper and lower bounds on the convergence rate.
Abstract
Adaptive gradient methods are arguably the most successful optimization algorithms for neural network training. While it is well-known that adaptive gradient methods can achieve better …
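For context, the AdaGrad method analyzed in the paper accumulates squared gradients per coordinate and scales each step by the inverse square root of that running sum. Below is a minimal, illustrative NumPy sketch of the coordinate-wise update under that standard formulation; the function name, step size, and the noisy quadratic test problem are assumptions for demonstration, not the paper's setup.

```python
import numpy as np

def adagrad(grad_fn, x0, lr=0.5, eps=1e-8, n_steps=1000):
    """Sketch of coordinate-wise AdaGrad (illustrative, not the paper's exact variant)."""
    x = np.array(x0, dtype=float)
    g_sq = np.zeros_like(x)                   # running sum of squared gradients
    for _ in range(n_steps):
        g = grad_fn(x)                        # (possibly stochastic) gradient at x
        g_sq += g ** 2                        # accumulate per-coordinate statistics
        x -= lr * g / (np.sqrt(g_sq) + eps)   # adaptive per-coordinate step size
    return x

# Hypothetical usage: noisy gradients of f(x) = 0.5 * ||x||^2.
rng = np.random.default_rng(0)
noisy_grad = lambda x: x + 0.1 * rng.standard_normal(x.shape)
print(adagrad(noisy_grad, x0=np.ones(10)))
```

Because the effective step size shrinks fastest along coordinates with large accumulated gradients, AdaGrad can adapt to anisotropic geometry, which is the kind of favorable structure under which the paper compares its rates against SGD.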