BriefGPT.xyz
Nov, 2023
Signal Processing Meets SGD: From Momentum to Filter
Zhipeng Yao, Guisong Chang, Jiaqi Zhang, Qi Zhang, Yu Zhang...
TL;DR
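This paper introduces a new optimization method based on reducing the variance of historical gradients. It strengthens SGD's first-moment estimate by introducing adaptive weights, which change dynamically during deep learning model training to track changes in gradient variance. Experimental results show that the method achieves performance comparable to existing optimization methods.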
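The idea of a variance-adaptive first-moment estimate can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `adaptive_weight_sgd`, the parameters `k_min` and `decay`, and the specific weighting rule (a signal-to-noise style ratio) are all assumptions chosen only to show how a weight on the newest gradient might shrink when gradients scatter widely and grow when they agree.

```python
import random


def adaptive_weight_sgd(grad_fn, x0, lr=0.1, k_min=0.1, decay=0.9, steps=300, eps=1e-12):
    """Hypothetical 1-D sketch: SGD with a variance-adaptive first-moment estimate.

    Returns the final iterate and the list of weights used at each update.
    """
    x = float(x0)
    m = None   # first-moment estimate of the gradient
    v = 0.0    # running estimate of how far gradients scatter around m
    weights = []
    for _ in range(steps):
        g = grad_fn(x)
        if m is None:
            m = g  # initialise the estimate with the first gradient
        else:
            resid = g - m
            v = decay * v + (1.0 - decay) * resid * resid
            # Assumed rule (not from the paper): trust the new gradient less
            # when the scatter v is large relative to the squared estimate.
            k = max(k_min, m * m / (m * m + v + eps))
            m += k * resid
            weights.append(k)
        x -= lr * m  # descend along the filtered gradient estimate
    return x, weights


if __name__ == "__main__":
    random.seed(0)
    # Gradient of 0.5 * x**2 with additive Gaussian noise (illustrative only).
    noisy_grad = lambda x: x + random.gauss(0.0, 0.5)
    x_final, ks = adaptive_weight_sgd(noisy_grad, x0=5.0)
    print(x_final, min(ks), max(ks))
```

With a fixed weight the update reduces to an ordinary exponential moving average of gradients; letting the weight vary is what makes the estimate respond to changes in gradient variance in this sketch.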
Abstract
In the field of deep learning, stochastic gradient descent (SGD) and its momentum-based variants are the predominant choices for optimization algorithms. Despite all that, these