BriefGPT.xyz
Jun, 2020
Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum
Adai: Separating the Effects of Adaptive Learning Rate and Momentum Inertia
Zeke Xie, Xinrui Wang, Huishuai Zhang, Issei Sato, Masashi Sugiyama
TL;DR
By studying optimization algorithms for neural networks, this work proposes a new method named Adaptive Inertia that trains neural networks more effectively and improves generalization performance.
Abstract
Adaptive Momentum Estimation (Adam), which combines Adaptive Learning Rate and Momentum, is the most popular stochastic optimizer for accelerating the training of deep neural networks. But …
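For context on what the abstract means by "combines Adaptive Learning Rate and Momentum", here is a minimal sketch of the standard Adam update rule (not the paper's Adai method): the first moment `m` supplies momentum, while the second moment `v` rescales the step per parameter, acting as an adaptive learning rate. All names and hyperparameter values below are the conventional defaults, not taken from this paper.

```python
import math

def adam_step(theta, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter theta at step t (t >= 1)."""
    # First moment: exponential moving average of gradients (momentum).
    m = beta1 * m + (1 - beta1) * grad
    # Second moment: moving average of squared gradients (adaptive LR).
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the zero-initialized moment estimates.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(x) = x^2 (gradient 2x) starting from x = 1.0.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.01)
print(x)  # close to the minimizer 0
```

The paper's contribution, by contrast, is to decouple these two mechanisms so their separate effects on generalization can be studied.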