BriefGPT.xyz
Jun, 2021
Escaping Saddle Points Faster with Stochastic Momentum
Jun-Kun Wang, Chi-Heng Lin, Jacob Abernethy
TL;DR
This work studies the role of SGD with stochastic momentum in training deep neural networks, concluding that momentum improves on plain stochastic gradient descent by escaping saddle points faster and finding second-order stationary points faster. The theoretical analysis indicates that the momentum parameter $\beta$ should be close to 1, which is consistent with the experimental results.
Abstract
Stochastic gradient descent (SGD) with stochastic momentum is popular in nonconvex stochastic optimization and particularly for the training of deep neural networks. In standard SGD, parameters are updated by imp…
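The update rule the abstract refers to — SGD with a momentum (heavy-ball) term — can be sketched as follows. This is a minimal NumPy illustration, not the paper's code; the toy quadratic objective, the step size, and the function name are assumptions for the example.

```python
import numpy as np

def sgd_momentum_step(w, m, grad, lr=0.01, beta=0.9):
    """One SGD-with-momentum (heavy-ball) step:
    m_{t+1} = beta * m_t + grad,  w_{t+1} = w_t - lr * m_{t+1}.
    beta is the momentum parameter; beta = 0 recovers plain SGD."""
    m = beta * m + grad
    w = w - lr * m
    return w, m

# Toy objective f(w) = 0.5 * ||w||^2, so grad f(w) = w; we add noise
# to mimic the stochastic (mini-batch) gradient.
rng = np.random.default_rng(0)
w = np.array([5.0, -3.0])
m = np.zeros_like(w)
for _ in range(200):
    stochastic_grad = w + 0.1 * rng.normal(size=w.shape)
    w, m = sgd_momentum_step(w, m, stochastic_grad)

print(np.linalg.norm(w))  # iterates settle near the minimum at the origin
```

The momentum buffer `m` accumulates an exponentially weighted sum of past stochastic gradients, which is what biases each update toward the previous direction of travel; with `beta` close to 1 this averaging effect is strongest, matching the paper's theoretical recommendation.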