Jul, 2016
Nesterov's Accelerated Gradient and Momentum as Approximations to Regularised Update Descent
Aleksandar Botev, Guy Lever, David Barber
TL;DR
We present a unifying framework for adapting the update direction in gradient-based iterative optimization methods, re-deriving classical momentum and Nesterov's accelerated gradient as natural special cases and giving a new intuitive interpretation of the latter algorithm. We show that a new algorithm, Regularised Gradient Descent, converges more quickly than both Nesterov's algorithm and classical momentum.
Abstract
We present a unifying framework for adapting the update direction in gradient-based iterative optimization methods. As natural special cases we re-derive classical momentum and Nesterov's accelerated gradient method, lending a new intuitive interpretation to the latter algorithm. We show that a new algorithm, which we term Regularised Gradient Descent, is able to converge more quickly than both Nesterov's algorithm and the classical momentum algorithm.
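The two baselines the abstract re-derives have compact, well-known update rules. Below is a minimal sketch of both on a toy quadratic: heavy-ball momentum evaluates the gradient at the current iterate, while Nesterov's accelerated gradient evaluates it at a look-ahead point. The objective, step size lr, and momentum coefficient mu are illustrative choices, not taken from the paper, and the paper's own Regularised Gradient Descent update is not sketched here.

```python
import numpy as np

def grad(theta):
    # Gradient of a simple ill-conditioned quadratic f(theta) = 0.5 * theta^T A theta
    A = np.diag([1.0, 10.0])
    return A @ theta

def classical_momentum(theta0, lr=0.02, mu=0.9, steps=200):
    """Polyak's heavy-ball method: gradient taken at the current point theta."""
    theta, v = theta0.copy(), np.zeros_like(theta0)
    for _ in range(steps):
        v = mu * v - lr * grad(theta)
        theta = theta + v
    return theta

def nesterov(theta0, lr=0.02, mu=0.9, steps=200):
    """Nesterov's accelerated gradient: gradient taken at the look-ahead point theta + mu * v."""
    theta, v = theta0.copy(), np.zeros_like(theta0)
    for _ in range(steps):
        v = mu * v - lr * grad(theta + mu * v)
        theta = theta + v
    return theta

theta0 = np.array([5.0, 5.0])
print(classical_momentum(theta0))  # both iterates should approach the minimiser [0, 0]
print(nesterov(theta0))
```

The only difference between the two loops is where the gradient is evaluated, which is precisely the distinction the paper's unifying framework is built to explain.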