June 2019
An Improved Analysis of Training Over-parameterized Deep Neural Networks
Difan Zou, Quanquan Gu
TL;DR
This paper provides an improved analysis of the global convergence of (stochastic) gradient descent for training deep neural networks, which requires a milder over-parameterization condition than previous work in terms of the problem-dependent parameters; the analysis rests on a tighter gradient lower bound and a sharper characterization of the algorithm's trajectory.
Abstract
A recent line of research has shown that gradient-based algorithms with random initialization can converge to the global minima of the training loss for over-parameterized (i.e., sufficiently wide) deep neural networks.
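The result itself is purely theoretical, but a minimal sketch can make the regime concrete. The snippet below runs plain gradient descent on a heavily over-parameterized two-layer ReLU network with square loss, fitting n random points with m >> n hidden units; the width, step size, fixed random output layer, and all scalings are illustrative assumptions, not the paper's construction or its rates.

```python
import numpy as np

# Illustrative sketch only: gradient descent from random initialization on an
# over-parameterized (m >> n) two-layer ReLU network with square loss.
# All constants here are demo choices, not the paper's requirements.
rng = np.random.default_rng(0)
n, d, m = 20, 5, 5000                            # few samples, many hidden units
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)    # unit-norm inputs
y = rng.standard_normal(n)                       # arbitrary regression targets

W = rng.standard_normal((m, d))                  # random first-layer init
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m) # fixed output layer (common simplification)

eta = 0.05
for t in range(1000):
    H = np.maximum(X @ W.T, 0.0)                 # hidden ReLU features, shape (n, m)
    r = H @ a - y                                # residuals, shape (n,)
    # (Sub-)gradient of 0.5 * ||r||^2 w.r.t. W:
    # dL/dw_j = a_j * sum_i r_i * 1[w_j^T x_i > 0] * x_i
    G = a[:, None] * (((H > 0) * r[:, None]).T @ X)
    W -= eta * G
    if t % 200 == 0:
        print(f"iter {t:4d}  loss {0.5 * (r @ r):.6f}")
```

With this kind of width the training loss decreases steadily toward zero, which is the empirical behavior the convergence analyses in this line of work set out to explain.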