Jun, 2020
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen, Colin Wei, Jason D. Lee, Tengyu Ma
TL;DR
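In this paper, we prove theoretically that parameter-dependent noise in stochastic gradient descent (SGD), induced by mini-batches or label perturbation, is more effective than Gaussian noise and has an important implicit regularization effect when training overparameterized models.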
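To make "parameter-dependent" concrete, here is a minimal sketch on a toy problem that is not taken from this page. It assumes a scalar model f_v(x) = v²·x with squared loss, and compares the gradient noise produced by perturbing the label with ±sigma against spherical Gaussian noise added directly to the gradient. The function names, the model, and the constants are all illustrative assumptions.

```python
import random


def label_noise_grad(v, x, y, sigma, rng):
    """Gradient w.r.t. v of 0.5 * (v*v*x - (y + sigma*xi))**2, with xi = +/-1.

    Assumed toy model: f_v(x) = v*v*x. The label is perturbed, not the gradient.
    """
    xi = rng.choice((-1.0, 1.0))
    residual = v * v * x - (y + sigma * xi)
    return residual * 2.0 * v * x


def gaussian_noise_grad(v, x, y, sigma, rng):
    """Clean gradient of the same loss plus parameter-independent Gaussian noise."""
    residual = v * v * x - y
    return residual * 2.0 * v * x + rng.gauss(0.0, sigma)


def noise_std(grad_fn, v, x=1.0, y=1.0, sigma=0.5, n=20000, seed=0):
    """Empirical standard deviation of the stochastic gradient at parameter v."""
    rng = random.Random(seed)
    samples = [grad_fn(v, x, y, sigma, rng) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return var ** 0.5


if __name__ == "__main__":
    for v in (0.0, 1.0, 2.0):
        print(v, noise_std(label_noise_grad, v), noise_std(gaussian_noise_grad, v))
```

In this toy setting the label-noise term in the gradient is -sigma·xi·2·v·x, so its scale grows with |v| and vanishes at v = 0, while the Gaussian noise has the same scale at every v. That is the sense in which the shape of the noise covariance depends on the parameters; the sketch illustrates only this dependence, not the regularization result itself.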
Abstract
The noise in stochastic gradient descent (SGD) provides a crucial implicit regularization effect for training overparameterized models. Pr…