Mar, 2020
Slow and Stale Gradients Can Win the Race
Sanghamitra Dutta, Jianyu Wang, Gauri Joshi
TL;DR
This work analyzes the trade-off between error and training runtime for synchronous and asynchronous distributed stochastic gradient descent (SGD) under random straggler delays, proposes a method that gradually varies the degree of synchronicity, and shows that it performs well on the CIFAR-10 dataset.
Abstract
Distributed stochastic gradient descent (SGD), when run in a synchronous manner, suffers from delays in runtime as it waits for the slowest workers (stragglers).
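One of the variants the paper analyzes, K-sync SGD, captures this trade-off directly: each iteration proceeds as soon as the fastest K of P workers finish, so a smaller K cuts waiting time but averages fewer gradients per update. The sketch below is a minimal illustration of the runtime side only, assuming i.i.d. exponential compute delays (a common straggler model); it is not the authors' code, and the function name and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def avg_iteration_time(P, K, mu=1.0, trials=100_000):
    """Average wall-clock time per iteration when waiting for the
    fastest K of P workers, with i.i.d. Exp(mu) compute delays
    (an assumed straggler model, not taken from the paper's code)."""
    delays = rng.exponential(1.0 / mu, size=(trials, P))
    kth_fastest = np.sort(delays, axis=1)[:, K - 1]  # K-th order statistic
    return kth_fastest.mean()

P = 8
for K in (P, P // 2, 1):  # fully synchronous, partially synchronous, fastest-worker-only
    t = avg_iteration_time(P, K)
    # Waiting on fewer workers gives faster iterations, but each update
    # averages fewer gradients, so the error per iteration is higher.
    print(f"K={K}: avg time/iteration ≈ {t:.3f}")
```

Running this with P = 8 shows the fully synchronous setting (K = 8) paying roughly the harmonic-sum delay of the slowest worker, while K = 1 finishes each iteration about an order of magnitude faster; the paper's analysis quantifies what this speedup costs in gradient variance and staleness, motivating the gradual adjustment of K during training.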