BriefGPT.xyz
Apr, 2023
$μ^2$-SGD: Stable Stochastic Optimization via a Double Momentum Mechanism
Kfir Y. Levy
TL;DR
The paper combines two recent momentum-based mechanisms to derive a new gradient estimate, designs an SGD-style algorithm and an accelerated variant based on it, and shows that these new methods are robust to the choice of learning rate, achieving the same optimal convergence rates in both the noiseless and noisy settings.
Abstract
We consider stochastic convex optimization problems where the objective is an expectation over smooth functions. For this setting we suggest a novel gradient estimate that combines two recent mechanisms that are related to the notion of momentum. We then design an SGD-style algorithm, as well as an accelerated version, that make use of this new estimator, and demonstrate the robustness of these approaches to the choice of the learning rate.
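To make the idea above concrete, here is a minimal, hypothetical sketch of how a momentum-corrected gradient estimate can be combined with an averaged query point in an SGD loop. This is not the authors' exact algorithm: the estimator form (a STORM-style correction), the uniform running average, and all hyperparameters (`eta`, `beta`, `steps`) are assumptions made for illustration.

```python
import numpy as np

def double_momentum_sgd_sketch(grad, x0, eta=0.1, beta=0.9, steps=100, rng=None):
    """Hedged sketch of an SGD loop with two momentum-style ingredients.

    grad(x, rng) returns a (possibly noisy) gradient at x. Two assumed
    mechanisms: gradients are queried at a running average of the iterates,
    and the estimate uses a momentum-based correction
        d_t = g(y_t) + (1 - beta) * (d_{t-1} - g(y_{t-1})),
    which damps the variance of the stochastic gradients.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    y = x.copy()                           # averaged query point
    d = grad(y, rng)                       # initial gradient estimate
    g_prev = d.copy()
    for t in range(1, steps + 1):
        x = x - eta * d                    # SGD step on the estimate
        y = y + (x - y) / (t + 1)          # running average of iterates
        g = grad(y, rng)
        d = g + (1 - beta) * (d - g_prev)  # corrected momentum estimate
        g_prev = g
    return y

# Toy usage: noisy quadratic f(x) = ||x||^2, minimum at the origin.
f_grad = lambda x, rng: 2 * x + 0.01 * rng.standard_normal(x.shape)
x_star = double_momentum_sgd_sketch(f_grad, np.ones(3))
```

On this toy problem the averaged point drifts toward the minimizer; the point of the sketch is only to show where the two momentum mechanisms enter the loop, not to reproduce the paper's rates.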