Oct, 2023
MiCRO: Near-Zero Cost Gradient Sparsification for Scaling and Accelerating Distributed DNN Training
Daegun Yoon, Sangyoon Oh
TL;DR
MiCRO is a novel gradient sparsification method that achieves near-zero-cost sparsification with an excellent convergence rate by addressing the problems that limit the scalability and acceleration of distributed deep neural network training.
Abstract
Gradient sparsification is a communication optimisation technique for scaling and accelerating distributed deep neural network (DNN) training. It reduces the increasing communication traffic for gradient aggregation. …
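The truncated abstract only defines gradient sparsification in general terms, so as a hedged illustration of the technique (not MiCRO's own near-zero-cost selection algorithm, which is not detailed here), a minimal top-k sparsification step in PyTorch might look like the following sketch; the `density` parameter and the function name are assumptions for this example:

```python
import torch

def sparsify_topk(grad: torch.Tensor, density: float = 0.01):
    """Generic top-k gradient sparsification: keep only the `density`
    fraction of gradient entries with the largest magnitude, so that
    only (index, value) pairs need to be communicated for aggregation.

    This is a minimal sketch of the general technique the abstract
    describes, not MiCRO's method.
    """
    flat = grad.flatten()
    k = max(1, int(flat.numel() * density))
    # Select the k largest-magnitude gradients. This selection step
    # itself carries a compute cost, which is what "near-zero cost"
    # sparsifier designs aim to reduce.
    _, idx = torch.topk(flat.abs(), k)
    values = flat[idx]
    return idx, values, grad.shape

# Example: a worker sparsifies a 1M-element gradient before aggregation,
# shrinking the traffic from 1M floats to ~10k (index, value) pairs.
grad = torch.randn(1_000_000)
idx, values, shape = sparsify_topk(grad, density=0.01)
print(idx.numel(), "of", grad.numel(), "entries communicated")
```

The receiving side would scatter the (index, value) pairs back into a zero tensor of the saved `shape` before applying the optimizer step; this usage detail is likewise a generic assumption, not taken from the paper.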