BriefGPT.xyz
May, 2017
The Marginal Value of Adaptive Gradient Methods in Machine Learning
Ashia C. Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht
TL;DR
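This paper studies how adaptive optimization methods perform when training deep neural networks. On simple overparameterized problems, it finds that adaptive methods often reach solutions that generalize worse than those found by gradient descent or stochastic gradient descent, sometimes significantly worse, and it suggests that practitioners reconsider using adaptive methods to train neural networks.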
Abstract
Adaptive optimization methods, which perform local optimization with a metric constructed from the history of iterates, are becoming increasingly popular for training deep neural networks. Examples include AdaGra…
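As a rough illustration of what "a metric constructed from the history of iterates" can mean, the sketch below contrasts a plain gradient descent step with an AdaGrad-style step that rescales each coordinate by the square root of its accumulated squared gradients. The toy quadratic objective, the function names (`grad`, `gd_step`, `adagrad_step`, `run`), and the step sizes are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def grad(w):
    """Gradient of the toy objective f(w) = 0.5 * (10 * w[0]**2 + w[1]**2)."""
    return [10.0 * w[0], w[1]]


def gd_step(w, lr):
    """Plain gradient descent: one shared step size for every coordinate."""
    g = grad(w)
    return [wi - lr * gi for wi, gi in zip(w, g)]


def adagrad_step(w, accum, lr, eps=1e-8):
    """AdaGrad-style step: per-coordinate scaling from the gradient history."""
    g = grad(w)
    # Accumulate squared gradients; this running sum is the "history".
    accum = [a + gi * gi for a, gi in zip(accum, g)]
    # Divide each coordinate's step by the root of its accumulated history.
    w = [wi - lr * gi / (math.sqrt(a) + eps) for wi, gi, a in zip(w, g, accum)]
    return w, accum


def run(steps=200):
    """Run both methods from the same start point (hypothetical settings)."""
    w_gd = [1.0, 1.0]
    w_ada = [1.0, 1.0]
    accum = [0.0, 0.0]
    for _ in range(steps):
        w_gd = gd_step(w_gd, lr=0.05)
        w_ada, accum = adagrad_step(w_ada, accum, lr=0.5)
    return w_gd, w_ada


if __name__ == "__main__":
    print(run())
```

On this toy problem both methods approach the minimizer at the origin. The difference lies in the path: gradient descent moves faster along the steeper coordinate, while the AdaGrad-style step moves both coordinates at nearly the same relative rate because each one is normalized by its own gradient history. The sketch only shows the mechanics of the update and says nothing about generalization, which is the paper's actual subject.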