BriefGPT.xyz
Jun, 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora
TL;DR
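Through mathematical analysis and experiments, the paper shows that normalization layers (e.g., Batch Normalization, Layer Normalization) in deep networks help optimization and also promote generalization. For a class of neural networks with normalization, normalization combined with weight decay encourages gradient descent to reach the Edge of Stability, and in that regime the trajectory that gradient descent follows can be characterized.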
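One way to see why weight decay matters for normalized networks is that a loss computed on normalized weights does not change when the weights are rescaled. The sketch below is a toy illustration only and is not taken from the paper: the `loss` function, the `target` vector, and the finite-difference `grad` helper are all assumptions made for this example. It checks three standard consequences of scale invariance: the loss is unchanged under rescaling, the gradient is orthogonal to the weights, and the gradient shrinks in proportion as the weight norm grows.

```python
import math

def normalize(w):
    """Return w divided by its Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in w))
    return [x / norm for x in w]

def loss(w, target=(0.6, 0.8)):
    """Toy scale-invariant loss (assumed for illustration):
    squared distance between w/||w|| and a fixed unit target."""
    u = normalize(w)
    return sum((a - b) ** 2 for a, b in zip(u, target))

def grad(w, eps=1e-6):
    """Central finite-difference estimate of the gradient of loss at w."""
    g = []
    for i in range(len(w)):
        hi, lo = list(w), list(w)
        hi[i] += eps
        lo[i] -= eps
        g.append((loss(hi) - loss(lo)) / (2 * eps))
    return g

def vec_norm(v):
    return math.sqrt(sum(x * x for x in v))

w = [1.0, 2.0]
scaled = [3.0 * x for x in w]  # same direction, three times the norm

g = grad(w)
g_scaled = grad(scaled)

# Inner product of gradient and weights: close to zero for a scale-invariant loss.
dot = sum(a * b for a, b in zip(g, w))

print("loss(w) == loss(3w):", abs(loss(w) - loss(scaled)) < 1e-12)
print("<grad, w> ~ 0:", abs(dot) < 1e-6)
print("||grad(3w)|| * 3 ~ ||grad(w)||:",
      abs(3.0 * vec_norm(g_scaled) - vec_norm(g)) < 1e-4)
```

Because the gradient scales inversely with the weight norm, shrinking the norm (as weight decay does) makes each gradient step relatively larger, which is one commonly cited reason the two interact; the paper's own analysis should be consulted for the precise statement.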
Abstract
Normalization layers (e.g., Batch Normalization, Layer Normalization) were introduced to help with optimization difficulties in very deep nets, but they clearly also help ...