Feb, 2018

Iterate averaging as regularization for stochastic gradient descent

Gergely Neu, Lorenzo Rosasco

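TL;DR: The paper proposes a variant of the Polyak-Ruppert averaging scheme in which the iterates of a stochastic gradient method are combined with geometrically decaying weights rather than uniformly, so that the averaging itself acts as a regularizer. For linear least-squares regression the scheme is shown to be equivalent to ridge regression, and the authors derive finite-sample bounds that match those known for regularized stochastic gradient methods.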
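To make the ridge-regression connection concrete, here is a minimal numerical sketch. It is not the authors' algorithm: it assumes full-batch gradient descent (a deterministic stand-in for SGD, chosen so the check is reproducible), a zero starting point, and weights proportional to `gamma**k` on iterate `k`. The function names and the mapping `lam = (1 - gamma) / (gamma * eta)` are assumptions worked out for this deterministic setting, not formulas quoted from the paper.

```python
import numpy as np


def geometric_average_gd(X, y, eta, gamma, n_steps):
    """Run unregularized gradient descent on (1/2n)||Xw - y||^2 from w = 0
    and return the average of the iterates with weights proportional to gamma**k."""
    n, d = X.shape
    w = np.zeros(d)
    weighted_sum = np.zeros(d)
    weight_total = 0.0
    weight = 1.0  # gamma**0, the weight on the initial iterate
    for _ in range(n_steps):
        weighted_sum += weight * w
        weight_total += weight
        grad = X.T @ (X @ w - y) / n  # full-batch gradient (assumption: no sampling)
        w = w - eta * grad
        weight *= gamma
    return weighted_sum / weight_total


def ridge_solution(X, y, lam):
    """Closed-form minimizer of (1/2n)||Xw - y||^2 + (lam/2)||w||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

eta = 1.0 / np.linalg.eigvalsh(X.T @ X / len(y)).max()  # step size 1 / largest eigenvalue
gamma = 0.9                                             # geometric decay rate of the weights

w_avg = geometric_average_gd(X, y, eta, gamma, n_steps=2000)
lam = (1.0 - gamma) / (gamma * eta)  # ridge parameter implied by (eta, gamma) in this sketch
w_ridge = ridge_solution(X, y, lam)

print(np.max(np.abs(w_avg - w_ridge)))
```

In this setting the geometrically averaged iterate and the ridge solution agree up to truncation and floating-point error, even though no penalty term appears in the gradient steps. A `gamma` closer to 1 spreads the weights over more iterates and corresponds to a smaller implied `lam`, that is, weaker regularization.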

Abstract

We propose and analyze a variant of the classic Polyak-Ruppert averaging scheme, broadly used in stochastic gradient methods. Rather than a uniform average of the iterates, we consider a