Mar, 2017
A Robust Adaptive Stochastic Gradient Method for Deep Learning
Caglar Gulcehre, Jose Sotelo, Marcin Moczulski, Yoshua Bengio
TL;DR
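This paper proposes an adaptive learning rate algorithm that uses stochastic curvature information of the loss function to tune learning rates automatically. It also introduces a new variance reduction technique to speed up convergence. In experiments with deep neural networks, the method outperforms popular stochastic gradient algorithms.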
Abstract
Stochastic gradient algorithms are the main focus of large-scale optimization problems and have led to important successes in the recent advancement of deep learning algorithms. The convergence of SGD depends on t