Chiheon Kim, Saehoon Kim, Jongmin Kim, Donghoon Lee, Sungwoong Kim
TL;DR: This paper proposes an effective LR scheduling algorithm consisting of an adaptive warmup and a predefined decay; using a Gaussian-process-smoothed online probing method, it can effectively train neural networks with large batch sizes.
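The paper's scheduler adapts the warmup phase online, but the overall warmup-then-decay shape it automates can be sketched minimally. The snippet below is an illustrative linear-warmup plus cosine-decay schedule; all names and parameter values (`base_lr`, `warmup_steps`, `total_steps`) are hypothetical and not taken from the paper.

```python
import math

def lr_schedule(step, base_lr=0.1, warmup_steps=500, total_steps=10000):
    """Linear warmup to base_lr, followed by cosine decay to zero.

    Illustrative only: the paper adapts the warmup length online via
    Gaussian-process-smoothed probing rather than fixing it in advance.
    """
    if step < warmup_steps:
        # Warmup: scale the LR linearly with training progress.
        return base_lr * (step + 1) / warmup_steps
    # Decay: cosine annealing over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

For example, the LR rises from near zero to `base_lr` at step 500, then falls back to zero by step 10000.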
Abstract
Large-batch training has been essential in leveraging large-scale datasets
and models in deep learning. While it is computationally beneficial to use
large batch sizes, it often requires a specially designed learning rate (LR)
schedule to achieve a comparable level of performance as in