May 2018
How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift)
Shibani Santurkar, Dimitris Tsipras, Andrew Ilyas, Aleksander Madry
TL;DR
This paper investigates how and why batch normalization (BatchNorm) affects the training of deep neural networks (DNNs). It finds that BatchNorm's success does not come from stabilizing the distributions of layer inputs, but from making the optimization landscape smoother, which yields more stable and predictive gradients and thereby speeds up training.
Abstract
Batch normalization (BatchNorm) is a widely adopted technique that enables faster and more stable training of deep neural networks (DNNs). Despite its pervasiveness, the exact reasons for BatchNorm's effectiveness are still poorly understood. The popular belief is that this effectiveness stems from controlling the change of the layers' input distributions during training to reduce the so-called "internal covariate shift". In this work, we demonstrate that such distributional stability of layer inputs has little to do with the success of BatchNorm. Instead, we uncover a more fundamental impact of BatchNorm on the training process: it makes the optimization landscape significantly smoother. This smoothness induces a more predictive and stable behavior of the gradients, allowing for faster training.
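For concreteness, the BatchNorm transform discussed above can be sketched in a few lines of NumPy. This is a minimal illustrative forward pass only; the function name, toy data, and epsilon value are assumptions for the example and are not taken from the paper:

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Minimal sketch of the BatchNorm transform for a 2-D batch
    of shape (batch_size, features): normalize each feature to zero
    mean and unit variance over the batch, then apply the learned
    per-feature affine map (gamma, beta)."""
    mu = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                     # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalized activations
    return gamma * x_hat + beta             # learned scale and shift

# Usage: a batch of 4 examples with 3 features, deliberately off-center.
x = np.random.randn(4, 3) * 5.0 + 2.0
out = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
print(out.mean(axis=0), out.var(axis=0))    # ~0 mean, ~1 variance per feature
```

The paper's point is that while this transform does stabilize the per-batch statistics shown above, that stabilization is not what drives its optimization benefits.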