May, 2018
SmoothOut: Smoothing Out Sharp Minima for Generalization in Large-Batch Deep Learning
Wei Wen, Yandan Wang, Feng Yan, Cong Xu, Yiran Chen...
TL;DR
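This work proposes the SmoothOut framework, which injects noise to smooth out sharp minima in deep neural networks and thereby improves generalization. In experiments, SmoothOut and AdaSmoothOut consistently improved generalization in both small-batch and large-batch training.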
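As a rough illustration of what "injecting noise to smooth out sharp minima" can look like in a training step, here is a minimal sketch. It is not taken from the paper's code: the function name `smoothout_step`, the uniform noise range `a`, the choice to evaluate the gradient at the perturbed weights and apply it to the unperturbed ones, and the toy quadratic loss are all assumptions made for this example.

```python
import random


def smoothout_step(weights, grad_fn, lr=0.1, a=0.01, rng=random):
    """One noise-injected gradient step (illustrative sketch, not the authors' code)."""
    # 1. Perturb: add uniform noise in [-a, a] to every weight (assumed noise model).
    noise = [rng.uniform(-a, a) for _ in weights]
    perturbed = [w + n for w, n in zip(weights, noise)]
    # 2. Evaluate the gradient at the perturbed point, which samples the
    #    loss surface in a small neighbourhood around the current weights.
    grads = grad_fn(perturbed)
    # 3. Apply that gradient to the original, noise-free weights.
    return [w - lr * g for w, g in zip(weights, grads)]


def quadratic_grad(weights):
    # Gradient of the toy loss sum(w_i ** 2); a hypothetical stand-in for a real model.
    return [2.0 * w for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    w = [1.0, -2.0]
    for _ in range(50):
        w = smoothout_step(w, quadratic_grad, lr=0.1, a=0.01, rng=rng)
    print(w)  # weights end up close to the minimum at the origin
```

Setting `a=0` reduces the step to plain gradient descent, which makes it easy to compare the two behaviours on the same loss.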
Abstract
In distributed deep learning, a large batch size in stochastic gradient descent is required to fully exploit the computing power in distributed systems. However,