Optimal Convergence Rates for Multi-pass Stochastic Gradient Methods

May, 2016

Optimal Learning for Multi-pass Stochastic Gradient Methods

Junhong Lin, Lorenzo Rosasco

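TL;DR: This paper studies the learning properties of the stochastic gradient method when multiple passes over the data and mini-batches are allowed. It examines how the algorithm's parameters act as regularization: optimal finite-sample bounds can be achieved by controlling the number of passes, and with a suitable step size, larger mini-batches can be considered. The analysis is unified, covering both the batch gradient method and the stochastic gradient method as special cases, and it yields optimal convergence results for the batch gradient method, even in the non-attainable case.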
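To make the setting concrete, here is a minimal sketch of multi-pass mini-batch SGD for the square loss, with the number of passes chosen on held-out data. It is an illustration under stated assumptions, not the authors' implementation: the function names (multipass_sgd, best_number_of_passes), the linear model, the sampling of mini-batches with replacement, the definition of one pass as n // batch_size updates, and the use of a validation set are all choices made for this example.

```python
import random


def predict(w, x):
    """Linear prediction <w, x>."""
    return sum(wj * xj for wj, xj in zip(w, x))


def multipass_sgd(X, y, step_size, batch_size, n_passes, seed=0):
    """Mini-batch SGD on the square loss; returns the weights after each pass.

    Assumption of this sketch: one "pass" is n // batch_size updates, and
    each mini-batch is drawn uniformly at random with replacement.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    history = []
    steps_per_pass = max(1, n // batch_size)
    for _ in range(n_passes):
        for _ in range(steps_per_pass):
            batch = [rng.randrange(n) for _ in range(batch_size)]
            # Average gradient of 0.5 * (<w, x> - y)^2 over the mini-batch.
            grad = [0.0] * d
            for i in batch:
                residual = predict(w, X[i]) - y[i]
                for j in range(d):
                    grad[j] += residual * X[i][j] / batch_size
            w = [wj - step_size * gj for wj, gj in zip(w, grad)]
        history.append(list(w))
    return history


def mean_squared_error(w, X, y):
    """Average squared prediction error of w on (X, y)."""
    return sum((predict(w, x) - t) ** 2 for x, t in zip(X, y)) / len(X)


def best_number_of_passes(history, X_val, y_val):
    """Pick the pass count with the lowest held-out error (early stopping)."""
    errors = [mean_squared_error(w, X_val, y_val) for w in history]
    best = min(range(len(errors)), key=errors.__getitem__)
    return best + 1, errors
```

Calling multipass_sgd on a training set and then best_number_of_passes on a held-out set returns the pass count with the smallest validation error. In this sketch the step size stays fixed and the number of passes plays the role of the regularization parameter.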

Abstract

We analyze the learning properties of the stochastic gradient method when multiple passes over the data and mini-batches are allowed. In particular, we consider the square loss and show that for a universal