BriefGPT.xyz
May 2017
Convergence Analysis of Two-layer Neural Networks with ReLU Activation
Yuanzhi Li, Yang Yuan
TL;DR
This paper analyzes why stochastic gradient descent (SGD) converges when training two-layer feedforward neural networks with ReLU activations that contain a so-called "identity mapping" structure, under Gaussian-distributed inputs, and shows experimentally that multi-layer networks with this structure outperform vanilla networks.
Abstract
In recent years, stochastic gradient descent (SGD) based techniques have become the standard tools for training neural networks. However, formal theoretical understanding of why SGD can train neural networks in practice is largely missing. In this paper, we make progress on understanding this mystery by providing a convergence analysis for SGD on a rich subset of two-layer feedforward networks with ReLU activations, characterized by the "identity mapping" structure.
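To make the setting concrete, below is a minimal sketch of the kind of student-teacher experiment the abstract describes: a two-layer ReLU network whose hidden weight matrix has the "identity mapping" form I + W, trained by SGD on Gaussian inputs against labels from a teacher of the same form. The fixed all-ones output layer, squared loss, step size, and zero initialization are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20          # input / hidden dimension
lr = 0.01       # SGD step size (illustrative choice)
steps = 20000

# Teacher: y = 1^T ReLU((I + W*) x), i.e. identity mapping plus a
# small perturbation W* (assumed form of the architecture).
W_star = 0.3 * rng.standard_normal((d, d)) / np.sqrt(d)

def forward(W, x):
    """Two-layer net with identity mapping: sum of ReLU((I + W) x)."""
    return np.maximum((np.eye(d) + W) @ x, 0.0).sum()

# Student starts at W = 0, i.e. at the pure identity mapping.
W = np.zeros((d, d))

for t in range(steps):
    x = rng.standard_normal(d)          # Gaussian input, as in the paper
    y = forward(W_star, x)              # teacher label
    h = (np.eye(d) + W) @ x             # student pre-activations
    err = forward(W, x) - y             # residual of the squared loss
    # Gradient of 0.5 * err^2 w.r.t. W: err * 1{h > 0} (outer) x,
    # using the ReLU derivative 1{h_i > 0}.
    grad = err * np.outer((h > 0).astype(float), x)
    W -= lr * grad

print("final parameter error:", np.linalg.norm(W - W_star))
```

In this student-teacher view, the parameter error ||W - W*|| shrinking toward zero corresponds to SGD recovering the teacher's global minimum; the identity mapping keeps the initialization close to a well-behaved region, which is the intuition the paper formalizes.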