BriefGPT.xyz
Oct, 2021
Exponential Graph is Provably Efficient for Decentralized Deep Training
Bicheng Ying, Kun Yuan, Yiming Chen, Hanbin Hu, Pan Pan...
TL;DR
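Studies a new approach to decentralized stochastic gradient descent: using exponential graphs as the communication topology to achieve efficient exact averaging at lower communication cost, which in turn improves training speed and quality.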
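As a rough illustration of how an exponential-graph topology can yield exact averaging, the sketch below builds two mixing matrices in plain Python. It assumes a common construction in which node i receives from nodes (i + 2^k) mod n, plus a "one-peer" variant that uses a single hop per round; the function names and the uniform weights are illustrative choices, not taken from this summary. For n a power of two, cycling through the log2(n) one-peer matrices produces the exact average, which follows from the product of the circulant matrices (I + P^(2^k)) / 2.

```python
from fractions import Fraction
from math import ceil, log2


def static_exponential_weights(n):
    """Hypothetical helper: node i averages itself and nodes (i + 2^k) mod n."""
    hops = [2 ** k for k in range(ceil(log2(n)))] if n > 1 else []
    weight = Fraction(1, len(hops) + 1)  # uniform weights (an assumption)
    W = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        W[i][i] = weight
        for h in hops:
            W[i][(i + h) % n] += weight
    return W


def one_peer_exponential_weights(n, t):
    """Hypothetical helper: at round t, node i averages with one peer only."""
    tau = max(1, ceil(log2(n)))
    hop = 2 ** (t % tau)  # cycle through hops 1, 2, 4, ...
    W = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        W[i][i] += Fraction(1, 2)
        W[i][(i + hop) % n] += Fraction(1, 2)
    return W


def mix(W, x):
    """One communication round: every node takes a weighted average."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]


n = 8
x0 = [Fraction(v) for v in range(n)]  # one scalar "model" per node
target = sum(x0) / n                  # the exact global average

# One-peer rounds: log2(n) = 3 rounds, one neighbor per node per round.
x_one_peer = list(x0)
for t in range(3):
    x_one_peer = mix(one_peer_exponential_weights(n, t), x_one_peer)

# Static graph: a single round with log2(n) neighbors per node.
x_static = mix(static_exponential_weights(n), x0)

one_peer_is_exact = all(v == target for v in x_one_peer)
static_is_exact = all(v == target for v in x_static)
```

Exact rational arithmetic is used so the equality check is not blurred by floating-point rounding. In this toy setting `one_peer_is_exact` holds while `static_is_exact` does not, which is the contrast between exact and inexact averaging that the summary refers to.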
Abstract
Decentralized SGD is an emerging training method for deep learning known for its much less (thus faster) communication per iteration, which relaxes the averaging step in parallel SGD to inexact …