BriefGPT.xyz
Mar, 2019
Efficient Sketch-Based Distributed SGD
Communication-efficient distributed SGD with Sketching
Nikita Ivkin, Daniel Rothchild, Enayat Ullah, Vladimir Braverman, Ion Stoica...
TL;DR
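This paper proposes an algorithm called Sketched SGD, which performs distributed SGD by communicating sketches of the gradients rather than the full gradients. Compared with other gradient compression methods, Sketched SGD lowers communication cost by roughly 40x by reducing the amount of communication, without hurting final model performance.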
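To make the idea of sending a sketch instead of a full gradient concrete, here is a minimal sketch, assuming a Count Sketch as the gradient sketch and a plain top-k recovery on the aggregating node. The class `CountSketch`, the function `sketched_topk_sum`, and the parameters `rows`, `cols`, `k`, and `seed` are illustrative names, not taken from the paper, and the procedure is a simplification rather than the authors' implementation.

```python
import numpy as np


class CountSketch:
    """Minimal Count Sketch of a length-d vector, stored as a rows x cols table."""

    def __init__(self, d, rows=7, cols=512, seed=0):
        # Assumption: all workers share the seed, so they use identical hashes.
        rng = np.random.default_rng(seed)
        self.buckets = rng.integers(0, cols, size=(rows, d))  # bucket hash per row
        self.signs = rng.choice([-1.0, 1.0], size=(rows, d))  # sign hash per row
        self.table = np.zeros((rows, cols))

    def accumulate(self, vec):
        """Add a dense vector into the sketch."""
        for r in range(self.table.shape[0]):
            # np.add.at sums correctly when several coordinates share a bucket.
            np.add.at(self.table[r], self.buckets[r], self.signs[r] * vec)

    def estimate(self):
        """Estimate every coordinate as the median of its per-row estimates."""
        row_idx = np.arange(self.table.shape[0])[:, None]
        per_row = self.signs * self.table[row_idx, self.buckets]
        return np.median(per_row, axis=0)


def sketched_topk_sum(worker_grads, k, rows=7, cols=512, seed=0):
    """Approximate the top-k coordinates of the summed gradient from sketches only."""
    d = worker_grads[0].size
    merged = np.zeros((rows, cols))
    for grad in worker_grads:
        sketch = CountSketch(d, rows, cols, seed)
        sketch.accumulate(grad)
        # Sketches are linear: the sum of sketches is the sketch of the sum.
        merged += sketch.table
    server = CountSketch(d, rows, cols, seed)
    server.table = merged
    est = server.estimate()
    top = np.argsort(np.abs(est))[-k:]  # indices of the k largest estimates
    update = np.zeros(d)
    update[top] = est[top]
    return update
```

In this toy setup each worker transmits only the `rows * cols` table instead of all `d` gradient entries, so the saving depends on how small the table can be while the large coordinates remain recoverable.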
Abstract
Large-scale distributed training of neural networks is often limited by network bandwidth, wherein the communication time overwhelms the local computation time. Motivated by the success of ...