Convergence rates for momentum stochastic gradient descent with noise of machine learning type

Feb, 2023

Convergence rates for momentum stochastic gradient descent with noise of machine learning type

Benjamin Gess, Sebastian Kassing

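TL;DR: This paper studies the momentum stochastic gradient descent (MSGD) algorithm and its continuous-time version in non-convex optimization. It proves that, when the objective function is Lipschitz continuous and satisfies the Polyak-Lojasiewicz inequality, the objective function value converges exponentially fast, and that, for a given friction parameter, the MSGD process converges almost surely.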
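To make the statement above concrete, the following is a minimal sketch of a momentum SGD iteration with a friction term, run on a toy objective. It is not the paper's exact scheme: the objective f(x) = 0.5 |x|^2 (which satisfies a Polyak-Lojasiewicz inequality, since |∇f(x)|^2 = 2 f(x)), the noise model (standard deviation proportional to sqrt(f(x)), so the noise vanishes at the global minimum), the discretization, and all names and parameter values (`msgd`, `step`, `friction`, `noise_scale`) are assumptions chosen for illustration only.

```python
import numpy as np


def f(x):
    """Toy objective f(x) = 0.5 * |x|^2 (illustrative assumption)."""
    return 0.5 * float(np.dot(x, x))


def grad_f(x):
    """Exact gradient of the toy objective."""
    return x


def msgd(x0, step=0.05, friction=2.0, noise_scale=0.5, n_steps=1000, seed=0):
    """Hypothetical momentum SGD with friction and state-dependent noise."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)  # velocity (momentum) variable
    values = [f(x)]
    for _ in range(n_steps):
        # Assumed noise model: scale shrinks with f(x), so the gradient
        # estimate becomes exact at the global minimum.
        noise = noise_scale * np.sqrt(f(x)) * rng.standard_normal(x.shape)
        g = grad_f(x) + noise
        # Friction damps the velocity; the noisy gradient accelerates it.
        v = v - step * friction * v - step * g
        x = x + step * v
        values.append(f(x))
    return x, np.array(values)


x_final, values = msgd([3.0, -2.0])
print(f"f(x_0) = {values[0]:.3e}, f(x_n) = {values[-1]:.3e}")
```

Plotting `values` on a logarithmic scale should show a roughly linear decay, which is what exponential convergence of the objective value looks like in this toy setting. Changing `friction` changes the slope, which gives a rough feel for why the choice of friction parameter affects the convergence rate.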

Abstract

We consider the momentum stochastic gradient descent scheme (MSGD) and its continuous-in-time counterpart in the context of non-convex optimization. We show almost sure exponential convergence of the objective fu