$μ^2$-SGD: A Double Momentum Mechanism for Stable Stochastic Optimization

Apr, 2023

$μ^2$-SGD: Stable Stochastic Optimization via a Double Momentum Mechanism

Kfir Y. Levy

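TL;DR: This work combines two recent momentum-based mechanisms into a new gradient estimate, uses that estimate to design an SGD-based algorithm and an accelerated variant, and shows that these methods are robust to the choice of learning rate and attain the same optimal convergence rates in both the noiseless and the noisy case.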

Abstract

We consider stochastic convex optimization problems where the objective is an expectation over smooth functions. For this setting we suggest a novel gradient estimate that combines two recent mechanisms that are r