BriefGPT.xyz
Nov, 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
Simon S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Xiyu Zhai
TL;DR
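By analyzing the structure of the Gram matrix induced by the neural network architecture, the paper proves that gradient descent achieves zero training loss in polynomial time for a deep over-parameterized neural network with residual connections (ResNet). It then extends the analysis to deep residual convolutional neural networks and obtains a similar convergence result.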
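To make the role of the Gram matrix more concrete, here is a minimal sketch under simplifying assumptions. It is not the paper's deep ResNet setting: it uses a two-layer ReLU network with fixed output weights, trains only the first layer with plain gradient descent on a squared loss, and builds the corresponding Gram matrix at initialization. The function names (`relu_net`, `gram_matrix`, `train`), the problem sizes, and the step size are all hypothetical choices made for illustration.

```python
import math
import random


def relu_net(W, a, x):
    """Two-layer ReLU network: f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x)."""
    total = 0.0
    for w_r, a_r in zip(W, a):
        pre = sum(wi * xi for wi, xi in zip(w_r, x))
        total += a_r * max(pre, 0.0)
    return total / math.sqrt(len(W))


def gram_matrix(W, X):
    """H[i][j] = (x_i . x_j) * (fraction of hidden units active on both x_i and x_j)."""
    m, n = len(W), len(X)
    active = [[sum(wi * xi for wi, xi in zip(w_r, x)) >= 0.0 for w_r in W] for x in X]
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(p * q for p, q in zip(X[i], X[j]))
            both = sum(1 for r in range(m) if active[i][r] and active[j][r])
            H[i][j] = dot * both / m
    return H


def train(n=5, d=3, m=200, lr=0.1, steps=200, seed=0):
    """Run gradient descent on the first layer; return the initial Gram matrix and the losses."""
    rng = random.Random(seed)
    # Assumed toy data: n random unit-norm inputs with random +/-1 labels.
    X = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        X.append([c / norm for c in v])
    y = [rng.choice([-1.0, 1.0]) for _ in range(n)]
    # Over-parameterized hidden layer: m is much larger than n.
    W = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    a = [rng.choice([-1.0, 1.0]) for _ in range(m)]  # output weights stay fixed

    H0 = gram_matrix(W, X)
    losses = []
    for _ in range(steps + 1):
        residual = [relu_net(W, a, x) - yi for x, yi in zip(X, y)]
        losses.append(0.5 * sum(r * r for r in residual))
        for r in range(m):
            # Gradient of the squared loss with respect to hidden weight vector w_r.
            grad = [0.0] * d
            for x, res in zip(X, residual):
                if sum(wi * xi for wi, xi in zip(W[r], x)) >= 0.0:
                    for k in range(d):
                        grad[k] += res * a[r] * x[k] / math.sqrt(m)
            W[r] = [w - lr * g for w, g in zip(W[r], grad)]
    return H0, losses


if __name__ == "__main__":
    H0, losses = train()
    print("H[0][0] at initialization:", H0[0][0])
    print("initial loss:", losses[0], "final loss:", losses[-1])
```

In this toy setting, the vector of prediction errors evolves approximately as if multiplied by (I - lr * H) at each step, so a Gram matrix whose smallest eigenvalue stays positive during training is what would drive the training loss toward zero. The sketch only illustrates that mechanism and makes no claim about the rates proved in the paper.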
Abstract
Gradient descent finds a global minimum in training deep neural networks despite the objective function being non-convex. The current pape