March 2022
Locally Asynchronous Stochastic Gradient Descent for Decentralised Deep Learning
Tomer Avidor, Nadav Tal Israel
TL;DR
This paper examines communication topology as a design choice in distributed deep neural network training and shows how the asynchronous decentralised algorithm LASGD achieves model synchronisation. Experiments demonstrate that LASGD accelerates training on the large-scale ImageNet image classification dataset compared to SGD and state-of-the-art gossip-based algorithms.
Abstract
Distributed training algorithms for deep neural networks show impressive convergence speedup properties on very large problems. However, they inherently suffer from communication-related slowdowns, and communication topology becomes a crucial design choice.
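To make the synchronisation scheme concrete, below is a minimal sketch of locally asynchronous SGD in PyTorch: each worker takes several purely local SGD steps, then averages its model with all other workers through a non-blocking All Reduce. The function name lasgd_worker, the fixed local_steps schedule, and the point at which the communication handles are awaited are illustrative assumptions, not the paper's implementation.

import torch
import torch.distributed as dist
import torch.nn.functional as F

def lasgd_worker(model, optimizer, data_loader, local_steps, world_size):
    # Illustrative sketch only (not the authors' code). Assumes the
    # process group was already set up via dist.init_process_group().
    pending = []  # communication handles from the last averaging round
    for step, (x, y) in enumerate(data_loader):
        if pending:
            # Complete the previous non-blocking All Reduce before the
            # weights are used again; the summed parameters are then
            # rescaled to an average over all workers.
            for handle in pending:
                handle.wait()
            with torch.no_grad():
                for p in model.parameters():
                    p.div_(world_size)
            pending = []
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()  # purely local SGD update
        if (step + 1) % local_steps == 0:
            # Launch a non-blocking sum of the parameters across workers.
            # Here only loading the next mini-batch overlaps with the
            # communication; a fuller implementation would overlap more.
            pending = [dist.all_reduce(p.data, op=dist.ReduceOp.SUM,
                                       async_op=True)
                       for p in model.parameters()]

Unlike the gossip-based baselines mentioned in the TL;DR, which average each model only with a sparse set of neighbours, an All Reduce produces an exact average over all workers.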