TL;DR This paper explains why weight initialization is critical to the convergence of neural networks. By studying the effect of nonlinear activation functions, it proposes a general weight initialization strategy and explains why Xavier initialization performs poorly with the Rectified Linear Unit (ReLU) activation function.
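The failure of Xavier initialization under ReLU can be seen empirically: since ReLU zeroes out roughly half the signal (for a zero-mean symmetric input z, E[ReLU(z)²] = Var(z)/2), weights drawn with variance 1/fan_in shrink the activation magnitude by about half at every layer, whereas variance 2/fan_in (He initialization) preserves it. The sketch below, with a hypothetical width and depth, propagates a random input through a deep stack of ReLU layers and measures the mean-squared activation under each scaling:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 512      # layer width (hypothetical choice for illustration)
depth = 20   # number of stacked ReLU layers

def mean_square_after(scale):
    """Forward a random input through `depth` ReLU layers with
    Var(W) = scale / fan_in, and return the mean-squared activation."""
    h = rng.standard_normal((n, 1))
    for _ in range(depth):
        W = rng.standard_normal((n, n)) * np.sqrt(scale / n)
        h = np.maximum(W @ h, 0.0)  # ReLU
    return float((h ** 2).mean())

ms_xavier = mean_square_after(1.0)  # Xavier: Var(W) = 1 / fan_in
ms_he = mean_square_after(2.0)      # He:     Var(W) = 2 / fan_in

# Xavier halves the signal power per layer (~0.5**20 overall);
# He keeps it at order one.
print(f"Xavier: {ms_xavier:.2e}, He: {ms_he:.2e}")
```

Under Xavier scaling the signal decays geometrically with depth, so gradients in deep ReLU networks vanish; doubling the weight variance compensates for the halving caused by ReLU.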
Abstract
A proper initialization of the weights in a neural network is critical to its
convergence. Current insights into weight initialization come primarily from
linear activation functions. In this paper, I develop a t