BriefGPT.xyz
May 2022
Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora, Zhiyuan Li, Abhishek Panigrahi
TL;DR
Studies the Edge of Stability, a challenging phenomenon in neural network training, and identifies a new implicit regularization mechanism: GD evolves along a low-dimensional flow on the manifold of minimum loss, in contrast to prior analyses that rely on infinitesimal updates or gradient noise.
Abstract
Deep learning experiments in Cohen et al. (2021) using deterministic gradient descent (GD) revealed an edge of stability (EoS) phase …
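The EoS phenomenon described above can be observed in small experiments. Below is a minimal sketch (not from the paper) that runs full-batch GD on a toy regression task and tracks the sharpness, i.e. the top Hessian eigenvalue, estimated by power iteration on Hessian-vector products; in the EoS regime the sharpness hovers near 2/LR while the loss oscillates yet trends downward. The architecture, data, and learning rate are arbitrary illustrative choices, and whether EoS is actually reached depends on them.

```python
# Minimal sketch of the Edge-of-Stability observation from Cohen et al. (2021):
# deterministic (full-batch) GD, with sharpness measured along the trajectory.
# All sizes and the learning rate are hypothetical choices for illustration.
import torch

torch.manual_seed(0)
X = torch.randn(32, 5)                      # toy inputs
y = torch.randn(32, 1)                      # toy regression targets

model = torch.nn.Sequential(
    torch.nn.Linear(5, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)
params = list(model.parameters())
lr = 0.05                                   # full-batch GD step size

def loss_fn():
    return 0.5 * ((model(X) - y) ** 2).mean()

def sharpness(n_iter=20):
    """Estimate the top Hessian eigenvalue by power iteration on HVPs."""
    v = [torch.randn_like(p) for p in params]
    for _ in range(n_iter):
        loss = loss_fn()
        grads = torch.autograd.grad(loss, params, create_graph=True)
        gv = sum((g * vi).sum() for g, vi in zip(grads, v))
        hv = torch.autograd.grad(gv, params)            # Hessian-vector product
        norm = torch.sqrt(sum((h ** 2).sum() for h in hv))
        v = [h / norm for h in hv]                      # renormalize iterate
    return norm.item()                                  # ~|largest eigenvalue|

for step in range(2000):
    loss = loss_fn()
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p -= lr * g                                 # deterministic GD update
    if step % 200 == 0:
        print(f"step {step:4d}  loss {loss.item():.4f}  "
              f"sharpness {sharpness():.2f}  2/lr = {2 / lr:.2f}")
```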