Big Batch SGD: Automated Inference using Adaptive Batch Sizes

Oct, 2016

Soham De, Abhay Yadav, David Jacobs, Tom Goldstein

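TL;DR: This paper introduces an adaptive "big batch" stochastic gradient descent scheme that keeps the signal-to-noise ratio of the gradient approximation stable. This enables automated learning rate selection, avoids step size decay, and does not require the objective function to be convex.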
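The idea of holding the gradient's signal-to-noise ratio steady by growing the batch can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration rather than the authors' exact procedure: the least-squares objective, the names `per_example_grads`, `noise_dominates`, and `big_batch_sgd`, the threshold `theta`, the doubling rule `growth`, and the particular test (estimated variance of the batch-mean gradient compared against `theta**2` times its squared norm). A fixed learning rate is used to keep the sketch short; it stands in for the automated step size selection that the summary mentions.

```python
import numpy as np


def per_example_grads(w, X, y):
    """Gradient of 0.5 * (x_i @ w - y_i)**2 for each example i, shape (n, d)."""
    residual = X @ w - y
    return residual[:, None] * X


def noise_dominates(grads, theta):
    """Assumed test: True when the estimated variance of the batch-mean
    gradient exceeds theta**2 times the squared norm of that mean."""
    b = grads.shape[0]
    mean = grads.mean(axis=0)
    # Sum of per-coordinate sample variances, divided by b, estimates the
    # variance of the mean of b independent per-example gradients.
    var_of_mean = grads.var(axis=0, ddof=1).sum() / b
    return var_of_mean > theta**2 * (mean @ mean)


def big_batch_sgd(X, y, lr=0.1, theta=0.5, batch0=8, growth=2, steps=200, seed=0):
    """SGD whose batch size grows whenever gradient noise dominates the signal."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    batch = min(batch0, n)
    history = []  # batch size used at each step
    for _ in range(steps):
        idx = rng.choice(n, size=batch, replace=False)
        grads = per_example_grads(w, X[idx], y[idx])
        # Enlarge the batch (hypothetical doubling rule) until the signal
        # dominates the noise or the whole dataset is in use.
        while batch < n and noise_dominates(grads, theta):
            batch = min(n, batch * growth)
            idx = rng.choice(n, size=batch, replace=False)
            grads = per_example_grads(w, X[idx], y[idx])
        w -= lr * grads.mean(axis=0)  # fixed step size: a simplification
        history.append(batch)
    return w, history
```

In this sketch the batch size never shrinks, so `history` is non-decreasing: small batches are used while the gradient is large far from a solution, and larger ones take over as the iterates approach it and the gradient signal fades.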

Abstract

Classical stochastic gradient methods for optimization rely on noisy gradient approximations that become progressively less accurate as iterates approach a solution. The large noise and small signal in the resulting gradients make it difficult to use them for adaptive stepsize selecti