May 2018
Exponential Convergence Rates for Batch Normalization: The Power of Length-Direction Decoupling in Non-Convex Optimization
Towards a Theoretical Understanding of Batch Normalization
Jonas Kohler, Hadi Daneshmand, Aurelien Lucchi, Ming Zhou, Klaus Neymeyr...
TL;DR
By studying several machine learning problem instances, we show that the acceleration Batch Normalization provides in optimization stems from optimizing the length and direction of the parameters separately, and that for these machine learning problems Batch Normalization can be a convergent algorithm.
Abstract
Normalization techniques such as batch normalization have been applied very successfully for training deep neural networks. Yet, despite its apparent empirical benefits, the reasons behind the success of …