BriefGPT.xyz
May, 2016
Asynchrony begets Momentum, with an Application to Deep Learning
Ioannis Mitliagkas, Ce Zhang, Stefan Hadjis, Christopher Ré
TL;DR
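The paper argues that running an optimization algorithm asynchronously acts like adding a momentum-like term to its update, and that this view carries over to training multi-layer neural networks. For convolutional neural networks, the degree of asynchrony correlates directly with the momentum, so tuning the momentum appropriately improves efficiency under asynchronous execution; conversely, negative momentum can be used to improve results.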
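As a point of reference for what "momentum" means here, the following is a minimal sketch of the standard heavy-ball SGD update on a toy quadratic. It is not the paper's code: the function names, the step size `lr`, the momentum values, and the toy objective are all illustrative assumptions. It only shows that the momentum coefficient `mu` is a tunable quantity that may also be set to a negative value.

```python
def momentum_sgd_step(w, v, grad, lr=0.1, mu=0.9):
    """One heavy-ball update: v <- mu * v - lr * grad, then w <- w + v.

    `mu` is the explicit momentum coefficient; mu = 0 gives plain SGD,
    and a negative mu is allowed (illustrative, not the paper's setup).
    """
    v_new = mu * v - lr * grad
    w_new = w + v_new
    return w_new, v_new


def minimize_quadratic(mu, steps=200, lr=0.1):
    """Minimize the toy objective f(w) = 0.5 * w**2 (gradient is w) from w = 1."""
    w, v = 1.0, 0.0
    for _ in range(steps):
        w, v = momentum_sgd_step(w, v, grad=w, lr=lr, mu=mu)
    return w


if __name__ == "__main__":
    # Compare positive, zero, and negative explicit momentum on the toy problem.
    for mu in (0.9, 0.0, -0.3):
        print(f"mu={mu:+.1f}  final w={minimize_quadratic(mu):.3e}")
```

In an asynchronous setting, the idea summarized above is that staleness contributes momentum of its own, so the explicit `mu` passed to a step like this would be lowered to compensate.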
Abstract
Asynchronous methods are widely used in deep learning, but have limited theoretical justification when applied to non-convex problems. We give a simple argument that running stochastic gradient descent (SGD) in a …