March 2017
How to Escape Saddle Points Efficiently
Chi Jin, Rong Ge, Praneeth Netrapalli, Sham M. Kakade, Michael I. Jordan
TL;DR
This paper shows that perturbed gradient descent converges to second-order stationary points in a small number of iterations, at a rate matching that of gradient descent to first-order stationary points. When all saddle points are non-degenerate, every second-order stationary point is a local minimum, so the result shows that perturbed gradient descent can escape saddle points almost for free, and it applies directly to many machine learning problems, including deep learning.
Abstract
This paper shows that a perturbed form of gradient descent converges to a second-order stationary point in a number of iterations which depends only poly-logarithmically on dimension (i.e., it is almost "dimension-free"). The convergence rate of this procedure matches the well-known convergence rate of gradient descent to first-order stationary points, up to log factors. When all saddle points are non-degenerate, all second-order stationary points are local minima, and this result thus shows that perturbed gradient descent can escape saddle points almost for free.
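The idea behind perturbed gradient descent is simple: run ordinary gradient descent, and whenever the gradient is small (the iterate looks first-order stationary), add noise drawn uniformly from a small ball so the iterate can slide off a strict saddle. Below is a minimal NumPy sketch of that idea; it is not the paper's exact algorithm (which sets the step size, perturbation radius, and thresholds from smoothness and Hessian-Lipschitz constants and adds a function-decrease termination check), and all hyperparameter values and the test function here are illustrative assumptions.

```python
import numpy as np

def perturbed_gradient_descent(grad, x0, eta=0.1, r=1e-2, g_thresh=1e-3,
                               t_thresh=10, max_iters=1000, seed=0):
    """Gradient descent with occasional random perturbations at
    near-stationary points, so strict saddles can be escaped."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    last_perturb = -t_thresh  # iteration of the most recent perturbation
    for t in range(max_iters):
        g = grad(x)
        # Near-zero gradient: either near a local minimum or near a saddle.
        # Perturb uniformly within a ball of radius r (but not too often).
        if np.linalg.norm(g) <= g_thresh and t - last_perturb >= t_thresh:
            xi = rng.normal(size=x.shape)
            xi *= r * rng.uniform() ** (1.0 / x.size) / np.linalg.norm(xi)
            x = x + xi
            last_perturb = t
            g = grad(x)  # re-evaluate the gradient after perturbing
        x = x - eta * g  # standard gradient step
    return x

# Illustrative test (assumed, not from the paper):
# f(x, y) = x**4/4 - x**2/2 + y**2/2 has a strict saddle at the origin and
# minima at (+-1, 0). Plain GD started exactly at the saddle never moves;
# the perturbation lets the iterate escape toward a minimum.
grad_f = lambda z: np.array([z[0]**3 - z[0], z[1]])
print(perturbed_gradient_descent(grad_f, np.zeros(2)))  # approx. [+-1, 0]
```

Started exactly at the saddle, the gradient is zero and an unperturbed iterate would stay put forever; a single perturbation of radius r is enough for the negative-curvature direction to take over, after which plain gradient steps do the rest.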