Feb, 2019
Towards moderate overparameterization: global convergence guarantees for training shallow neural networks
Samet Oymak, Mahdi Soltanolkotabi
TL;DR
This paper studies how much, and what kind of, overparameterization is needed for gradient descent to converge to a global optimum when training neural networks. Focusing on shallow neural networks with smooth activations, and supported by experiments, it proves that randomly initialized gradient descent converges to a global optimum as soon as the square root of the number of parameters exceeds the size of the training dataset.
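
A minimal sketch of the regime the TL;DR describes, not the authors' code: a one-hidden-layer network with a smooth (softplus) activation, trained by randomly initialized gradient descent over its hidden weights with a fixed output layer. The width k is chosen so the parameter count p = k*d satisfies sqrt(p) > n; the dimensions, step size, and activation are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # n training points in d dimensions; pick hidden width k so the
    # parameter count p = k*d satisfies sqrt(p) > n (illustrative only).
    n, d = 20, 10
    k = (n * n) // d + 1                   # p = k*d is roughly n^2
    X = rng.standard_normal((n, d)) / np.sqrt(d)
    y = rng.standard_normal(n)             # arbitrary labels to interpolate

    # Shallow net f(x) = v^T softplus(W x); only W is trained.
    W = rng.standard_normal((k, d)) / np.sqrt(d)
    v = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)

    def softplus(z):
        return np.logaddexp(0.0, z)        # numerically stable softplus

    def sigmoid(z):                        # derivative of softplus
        return 1.0 / (1.0 + np.exp(-z))

    lr = 0.5
    for step in range(2001):
        H = X @ W.T                        # (n, k) pre-activations
        residual = softplus(H) @ v - y
        loss = 0.5 * np.mean(residual ** 2)
        # Gradient of the mean-squared loss with respect to W.
        G = ((residual[:, None] * sigmoid(H)) * v).T @ X / n
        W -= lr * G
        if step % 500 == 0:
            print(f"step {step:5d}  loss {loss:.4e}")

In this overparameterized setting the printed loss should shrink steadily toward zero, illustrating the geometric convergence to an interpolating solution that the paper proves; the sketch does not reproduce the paper's analysis itself.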
Abstract
Many modern neural network architectures are trained in an overparameterized regime where the parameters of the model exceed the size of the training dataset. Sufficiently overparameterized neural network architectures…