Stochastic gradient algorithms have been the main workhorse for large-scale learning problems and have led to important successes in deep learning. The convergence of SGD depends on how carefully the learning rate is tuned and on the noise in the stochastic estimates of the gradient. In t