BriefGPT.xyz
Oct, 2023
The Noise Geometry of Stochastic Gradient Descent: A Quantitative and Analytical Characterization
Mingze Wang, Lei Wu
TL;DR
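This paper presents a comprehensive theoretical study of the noise geometry of stochastic gradient descent for over-parameterized linear models and two-layer neural networks, and shows that when SGD escapes sharp minima, its escape direction has a significant component along flat directions.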
Abstract
Empirical studies have demonstrated that the noise in stochastic gradient descent (SGD) aligns favorably with the local geometry of loss landscape. However, theoretical and quantitative explanations for this phen