A Robust Adaptive Stochastic Gradient Method for Deep Learning

Mar, 2017

Caglar Gulcehre, Jose Sotelo, Marcin Moczulski, Yoshua Bengio

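TL;DR: This paper proposes an adaptive learning rate algorithm that uses stochastic curvature information of the loss function to tune the learning rate automatically. It also introduces a new variance reduction technique to speed up convergence. In experiments with deep neural networks, the method outperforms popular stochastic gradient algorithms.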
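
To make the idea of a curvature-driven learning rate concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a toy quadratic loss with a known diagonal curvature `h`, uses exact rather than stochastic gradients, and leaves out the variance reduction step entirely. The names `grad`, `secant_step`, and `eps` are hypothetical and chosen only for illustration. The sketch estimates the per-parameter curvature from the change in gradient between two iterates (a secant estimate) and uses its inverse as the learning rate.

```python
import numpy as np

def grad(theta, h):
    # Gradient of the toy loss 0.5 * sum(h * theta**2) (assumed for illustration).
    return h * theta

def secant_step(theta, prev_theta, h, eps=1e-8):
    # One update whose per-parameter learning rate comes from a secant
    # (finite-difference) estimate of the diagonal curvature.
    g = grad(theta, h)
    g_prev = grad(prev_theta, h)
    delta = theta - prev_theta
    # Elementwise change in gradient divided by change in parameter.
    curvature = np.abs(g - g_prev) / (np.abs(delta) + eps)
    # Steeper directions get smaller steps, flatter directions larger ones.
    lr = 1.0 / (curvature + eps)
    return theta - lr * g, lr

# Badly conditioned toy problem: curvatures span two orders of magnitude.
h = np.array([1.0, 10.0, 100.0])
prev_theta = np.array([1.0, 1.0, 1.0])

# Take one small plain gradient step to obtain a second iterate.
theta = prev_theta - 0.001 * grad(prev_theta, h)

new_theta, lr = secant_step(theta, prev_theta, h)
print("learning rates:", lr)
print("parameters after one secant step:", new_theta)
```

On this quadratic the estimated learning rates are close to `1 / h`, so a single step lands near the minimum in every coordinate without any hand tuning. With noisy minibatch gradients the raw secant estimate would be much less reliable, which is the situation a variance reduction technique is meant to address.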

Abstract

Stochastic gradient algorithms are the main focus of large-scale optimization problems and have led to important successes in the recent advancement of deep learning algorithms. The convergence of SGD depends on t