Jul 2024
A Methodology Establishing Linear Convergence of Adaptive Gradient Methods under PL Inequality
Kushal Chakrabarti, Mayank Baranwal
TL;DR
The paper proves that the adaptive gradient methods AdaGrad and Adam achieve linear convergence when the loss function is smooth and satisfies the PL inequality. The theoretical framework takes a simple, unified approach, applies to both batch and stochastic gradients, and can potentially be used to analyze the linear convergence of other Adam variants.
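For context, here is a minimal statement of the assumption named above, in standard notation that is not quoted from the paper: a differentiable loss $f$ with minimum value $f^*$ satisfies the Polyak-Łojasiewicz (PL) inequality with constant $\mu > 0$ if

```latex
\frac{1}{2}\,\lVert \nabla f(x) \rVert^2 \;\ge\; \mu \bigl( f(x) - f^* \bigr)
\quad \text{for all } x,
```

and linear convergence means $f(x_k) - f^* \le (1-\rho)^k \bigl( f(x_0) - f^* \bigr)$ for some rate $\rho \in (0, 1)$. The PL inequality is weaker than strong convexity, which is what makes such results applicable beyond convex losses.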
Abstract
Adaptive gradient-descent optimizers are the standard choice for training neural network models. Despite their faster convergence than gradient-descent …
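As an illustration of the claimed behavior, here is a minimal sketch, not the paper's experiment: AdaGrad minimizing a smooth quadratic, which satisfies the PL inequality, where the loss is expected to decrease roughly geometrically once the step sizes stabilize. All constants below are illustrative.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's experiment): AdaGrad on
# f(x) = 0.5 * x^T A x, a smooth function that satisfies the PL
# inequality with mu = smallest eigenvalue of A. We print f(x_k) and
# expect a roughly geometric decrease, consistent with a linear rate.
rng = np.random.default_rng(0)
A = np.diag([1.0, 10.0])      # L-smooth with L = 10, PL with mu = 1
x = rng.normal(size=2)
g_sq_sum = np.zeros(2)        # AdaGrad's per-coordinate accumulator
eta, eps = 0.5, 1e-8          # illustrative stepsize and stabilizer

for k in range(201):
    grad = A @ x              # gradient of 0.5 * x^T A x
    g_sq_sum += grad**2
    x = x - eta * grad / (np.sqrt(g_sq_sum) + eps)
    if k % 40 == 0:
        print(f"iter {k:3d}  f(x) = {0.5 * x @ A @ x:.3e}")
```

Intuitively, once the gradients shrink geometrically the AdaGrad accumulator converges, so the effective per-coordinate stepsize approaches a constant; this is the mechanism that makes a linear rate plausible for adaptive methods under the PL condition.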