BriefGPT.xyz
Jun, 2023
Correlated Noise in Epoch-Based Stochastic Gradient Descent: Implications for Weight Variances
Marcel Kühn, Bernd Rosenow
TL;DR
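This paper studies the autocorrelation of the noise in discrete-time SGD with momentum, which is usually assumed to be white in time, and examines how epoch-based noise correlations affect SGD. The results show that for directions with curvature above a hyperparameter-dependent value, the results for uncorrelated noise are recovered, whereas for relatively flat directions the weight variance is significantly reduced.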
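The qualitative claim can be illustrated with a toy simulation. The sketch below is not the authors' setup: it assumes a one-dimensional quadratic loss per example, a heavy-ball momentum update, and illustrative hyperparameters (`lr`, `momentum`, `n_examples`, `batch_size` are all hypothetical choices). It compares the stationary weight variance when minibatches are drawn epoch-wise (shuffling without replacement) against drawing examples independently with replacement.

```python
import random
import statistics


def simulate(curvature, lr, momentum, epoch_based,
             n_examples=64, batch_size=4, n_steps=20000, burn_in=2000, seed=0):
    """Variance of w under heavy-ball SGD on a toy 1D quadratic (assumed model).

    Per-example loss: 0.5 * curvature * (w - x_i)**2, with sum(x_i) == 0,
    so the full-batch minimum is w = 0 and all fluctuations come from
    minibatch noise.
    """
    rng = random.Random(seed)
    targets = [rng.gauss(0.0, 1.0) for _ in range(n_examples)]
    mean = sum(targets) / n_examples
    targets = [t - mean for t in targets]  # center so the optimum is w = 0

    w, v = 0.0, 0.0
    order = []   # remaining examples of the current epoch
    trace = []
    for step in range(n_steps):
        if epoch_based:
            # Epoch-based sampling: reshuffle once every example has been used.
            if len(order) < batch_size:
                order = targets[:]
                rng.shuffle(order)
            batch = [order.pop() for _ in range(batch_size)]
        else:
            # Reference case: independent draws with replacement.
            batch = [rng.choice(targets) for _ in range(batch_size)]
        grad = curvature * (w - sum(batch) / batch_size)
        v = momentum * v - lr * grad   # heavy-ball velocity update
        w += v
        if step >= burn_in:
            trace.append(w)
    return statistics.pvariance(trace)


def variance_ratio(curvature, lr=0.05, momentum=0.5):
    """Epoch-based variance divided by with-replacement variance."""
    epoch_var = simulate(curvature, lr, momentum, epoch_based=True)
    iid_var = simulate(curvature, lr, momentum, epoch_based=False)
    return epoch_var / iid_var


flat_ratio = variance_ratio(curvature=0.1)    # slow relaxation: flat direction
steep_ratio = variance_ratio(curvature=10.0)  # fast relaxation: steep direction
print(f"flat direction ratio:  {flat_ratio:.3f}")
print(f"steep direction ratio: {steep_ratio:.3f}")
```

In this toy model, the ratio for the flat direction should come out well below 1, since the minibatch noise sums to zero over each epoch and the slowly relaxing weight averages it out. For the steep direction the ratio should stay close to 1, because the weight forgets past noise before the epoch-level anticorrelation can matter.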
Abstract
Stochastic gradient descent (SGD) has become a cornerstone of neural network optimization, yet the noise introduced by SGD is often assumed to be uncorrelated over time, despite the ubiquity of …