May 2023
The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent
Lei Wu, Weijie J. Su
TL;DR
This paper studies the implicit regularization of stochastic gradient descent (SGD) through the lens of dynamical stability and examines how stable minima affect the generalization performance of two-layer ReLU neural networks and diagonal linear networks. It finds that the stability-induced regularization of SGD is stronger than that of GD, that the effect grows with the learning rate, and that this explains why SGD generalizes better than GD.
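As background for the stability lens used above, below is a minimal numerical sketch of the classical linear-stability condition for full-batch gradient descent: near a minimum, GD behaves like the linear map theta -> (I - eta*H) theta, so the iterates stay bounded only if eta * lambda_max(H) <= 2. This is standard background, not the paper's method; the paper's contribution concerns the sharper condition that SGD's stability imposes. The helper names (gd_is_linearly_stable) and the random Hessian standing in for the loss curvature are illustrative assumptions.

```python
# Sketch: linear stability of gradient descent at a minimum theta* of a
# quadratic model L(theta) = 0.5 * theta^T H theta. GD iterates
# theta_{t+1} = (I - eta * H) theta_t, which stays bounded iff the
# spectral radius of (I - eta * H) is at most 1, i.e.
# eta * lambda_max(H) <= 2 (the "sharpness" bound on stable minima).
import numpy as np

rng = np.random.default_rng(0)

def lambda_max(H):
    # Largest eigenvalue of a symmetric matrix (eigvalsh sorts ascending).
    return np.linalg.eigvalsh(H)[-1]

def gd_is_linearly_stable(H, eta):
    return eta * lambda_max(H) <= 2.0

# A random positive semi-definite matrix standing in for the Hessian
# of the loss at a minimum.
A = rng.standard_normal((10, 10))
H = A @ A.T / 10

for eta in (0.01, 0.1, 1.0):
    print(f"eta={eta}: lambda_max={lambda_max(H):.3f}, "
          f"stable={gd_is_linearly_stable(H, eta)}")
```

Larger learning rates shrink the set of minima that can be stable, which is the sense in which stability acts as an implicit regularizer; the paper's point is that SGD's stability condition constrains minima even further.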
Abstract
In this paper, we study the implicit regularization of stochastic gradient descent (SGD) through the lens of dynamical stability (Wu …