BriefGPT.xyz
Jan, 2019
SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh
TL;DR
This work shows that stochastic gradient descent (SGD) can train deep neural networks and even converge to a global minimum. The result builds on experiments verifying that SGD follows a star-convex path and that the training loss approaches zero, and it reveals in a new way that SGD converges to a global minimum in a deterministic manner.
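The star-convexity condition mentioned above states that along the SGD path, the loss at each iterate is upper-bounded by the linearization toward the global minimizer: f(x_k) - f(x*) <= <grad f(x_k), x_k - x*>. A minimal sketch, assuming a toy least-squares problem with known minimizer x* = 0 (not the paper's deep-network experiments), checks this condition numerically along an SGD trajectory:

```python
import numpy as np

# Toy illustration: verify the star-convexity condition
#   f(x_k) - f(x*) <= <grad f(x_k), x_k - x*>
# along an SGD path on a least-squares loss whose global
# minimizer is x* = 0 with f(x*) = 0. (Hypothetical example,
# not the paper's setup.)

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 10))
b = np.zeros(50)  # targets chosen so that x* = 0, f(x*) = 0


def loss(x):
    r = A @ x - b
    return 0.5 * np.mean(r ** 2)


def stoch_grad(x, batch):
    # mini-batch gradient of the mean-squared loss
    r = A[batch] @ x - b[batch]
    return A[batch].T @ r / len(batch)


x = rng.normal(size=10)
x_star = np.zeros(10)
star_convex_holds = []
for k in range(200):
    g_full = A.T @ (A @ x - b) / 50  # full gradient at x_k
    # star-convexity w.r.t. x* at the current iterate
    star_convex_holds.append(
        loss(x) - loss(x_star) <= g_full @ (x - x_star) + 1e-12
    )
    batch = rng.choice(50, size=8, replace=False)
    x -= 0.05 * stoch_grad(x, batch)

print(all(star_convex_holds))
```

For this convex quadratic the condition holds at every iterate; the paper's point is that it also holds empirically, epoch-wise, along SGD paths on nonconvex deep-network losses when the final training loss is near zero.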
Abstract
Stochastic gradient descent (SGD) has been found to be surprisingly effective in training a variety of deep neural networks. However, there is still a lack of understanding on how and why SGD can train these complex networks […]