BriefGPT.xyz
Feb, 2018
signSGD: compressed optimisation for non-convex problems
Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Anima Anandkumar
TL;DR
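signSGD alleviates the communication bottleneck that arises when learning is distributed across multiple workers by transmitting only the sign of each minibatch stochastic gradient. In practice, its momentum counterpart matches the accuracy and convergence speed of Adam on deep Imagenet models. A proof that relies on a theorem by Gauss indicates that sign-based optimisation methods hold great promise for improving both communication efficiency and convergence speed.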
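The sign-based update described in the TL;DR can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names (`signsgd_step`, `signum_step`, `pack_signs`, `unpack_signs`) are hypothetical, the momentum variant assumes a standard exponential moving average of gradients, and mapping a zero gradient to +1 when packing is an arbitrary tie-breaking assumption.

```python
import numpy as np


def signsgd_step(params, grad, lr):
    """One sign-based update: move every coordinate by lr against the gradient sign."""
    return params - lr * np.sign(grad)


def signum_step(params, momentum, grad, lr, beta=0.9):
    """Momentum variant (assumed form): take the sign of a running gradient average."""
    momentum = beta * momentum + (1.0 - beta) * grad
    return params - lr * np.sign(momentum), momentum


def pack_signs(grad):
    """Compress a gradient to 1 bit per coordinate.

    Assumption: a bit of 1 means non-negative, 0 means negative,
    so an exactly-zero coordinate is sent as +1.
    """
    return np.packbits(grad >= 0)


def unpack_signs(packed, n):
    """Recover a +1/-1 vector of length n from the packed bits."""
    bits = np.unpackbits(packed)[:n]
    return bits.astype(np.float32) * 2.0 - 1.0
```

Under these assumptions, a worker would send `pack_signs(grad)` instead of the full `float32` gradient, so each coordinate costs 1 bit on the wire rather than 32, and the receiver would apply `unpack_signs` before the update.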
Abstract
Training large neural networks requires distributing learning across multiple workers, where the cost of communicating gradients can be a significant bottleneck.