BriefGPT.xyz
Feb, 2021
Stability of SGD: Tightness Analysis and Improved Bounds
Yikai Zhang, Wenjia Zhang, Sammy Bald, Vamsi Pingali, Chao Chen...
TL;DR
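This paper studies stochastic gradient descent methods as used to train large-scale machine learning models, analyzes how the loss function and the data distribution affect their generalization performance, and proposes improved data-dependent upper bounds and descent algorithms to better understand the generalization ability of deep networks.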
Abstract
Stochastic gradient descent (SGD) based methods have been widely used for training large-scale machine learning models that also generalize well in practice. Several explanations have been offered for this