June 2020
STL-SGD: Speeding Up Local SGD with Stagewise Communication Period
Shuheng Shen, Yifei Cheng, Jingchang Liu, Linli Xu
TL;DR
This paper proposes a distributed machine learning algorithm called STL-SGD, which reduces communication complexity and accelerates convergence by gradually increasing the communication period. STL-SGD is proven to achieve the same convergence rate and linear speedup as mini-batch SGD, and offers a further advantage when the objective is strongly convex or satisfies the Polyak-Łojasiewicz condition.
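The TL;DR describes the core mechanism: workers take local SGD steps and synchronize (average their models) less and less often as training proceeds. Below is a minimal sketch of that idea on a synthetic least-squares problem; the names (run_stl_sgd, grad, comm_period0) and the particular schedule (communication period doubles and learning rate halves each stage) are illustrative assumptions, not the paper's exact stagewise rule.

```python
# Hypothetical sketch of local SGD with a stagewise communication period.
# Not the authors' reference implementation; names and schedule are assumptions.
import numpy as np

def grad(w, x, y):
    """Stochastic gradient of the least-squares loss 0.5 * (x.w - y)^2."""
    return (x @ w - y) * x

def run_stl_sgd(data, w0, lr0=0.1, num_stages=4, steps_per_stage=100,
                comm_period0=1, seed=0):
    """Local SGD whose communication period grows (here: doubles) and whose
    learning rate shrinks (here: halves) at every stage."""
    rng = np.random.default_rng(seed)
    num_workers = len(data)
    workers = [w0.copy() for _ in range(num_workers)]  # one model copy per worker
    lr, period = lr0, comm_period0
    for stage in range(num_stages):
        for t in range(steps_per_stage):
            # Local updates: each worker takes an SGD step on its own data.
            for k, (X, Y) in enumerate(data):
                i = rng.integers(len(Y))
                workers[k] -= lr * grad(workers[k], X[i], Y[i])
            # Communicate (average the models) only every `period` steps.
            if (t + 1) % period == 0:
                avg = np.mean(workers, axis=0)
                workers = [avg.copy() for _ in range(num_workers)]
        # Stagewise schedule: longer communication period, smaller step size.
        period *= 2
        lr /= 2.0
    return np.mean(workers, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w_true = np.array([1.0, -2.0, 0.5])
    # Synthetic least-squares data split across 4 workers.
    data = []
    for _ in range(4):
        X = rng.normal(size=(200, 3))
        Y = X @ w_true + 0.01 * rng.normal(size=200)
        data.append((X, Y))
    w = run_stl_sgd(data, w0=np.zeros(3))
    print("recovered weights:", np.round(w, 3))
```

The point of the growing period is that early stages need frequent averaging to keep workers consistent, while late stages, with a small learning rate, can tolerate long stretches of purely local computation, which is what drives the communication savings.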
Abstract
Distributed parallel stochastic gradient descent algorithms are workhorses for large scale machine learning tasks. Among them, local stochastic gradient descent (Local SGD) has attracted significant attention due to its low communication complexity.