Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction

Jun, 2022

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction

Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora

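TL;DR: Mathematical analysis and experiments suggest that normalization layers (e.g., Batch Normalization, Layer Normalization) in deep networks not only ease optimization but also improve generalization. For a class of neural networks with normalization, normalization combined with weight decay drives gradient descent into the Edge of Stability regime, and in that regime the trajectory of gradient descent can be characterized as a flow.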
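
As a rough illustration of the Edge of Stability threshold mentioned in the TL;DR (this is not code from the paper): for gradient descent with learning rate η on the one-dimensional quadratic loss L(w) = (λ/2)·w², whose sharpness (second derivative) is λ, each step multiplies w by (1 − ηλ), so the iterates shrink when λ < 2/η and grow when λ > 2/η. The function name `gd_on_quadratic` and every constant below are assumptions chosen only for this sketch.

```python
def gd_on_quadratic(sharpness, lr, w0=1.0, steps=50):
    """Run gradient descent on L(w) = 0.5 * sharpness * w**2; return the final |w|."""
    w = w0
    for _ in range(steps):
        grad = sharpness * w   # dL/dw for the quadratic loss
        w = w - lr * grad      # equivalent to w *= (1 - lr * sharpness)
    return abs(w)


lr = 0.1  # hypothetical learning rate, so the stability threshold is 2 / lr = 20

below = gd_on_quadratic(sharpness=19.0, lr=lr)  # sharpness < 2 / lr: |w| decays
above = gd_on_quadratic(sharpness=21.0, lr=lr)  # sharpness > 2 / lr: |w| grows

print("sharpness 19:", below)
print("sharpness 21:", above)
```

A quadratic has constant sharpness, so this toy case only shows where the 2/η boundary sits; it does not reproduce the dynamics of a normalized network analyzed in the paper.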

Abstract

Normalization layers (e.g., Batch Normalization, Layer Normalization) were introduced to help with optimization difficulties in very deep nets, but they clearly also help …