The success of deep learning has led to a rising interest in the generalization properties of the stochastic gradient descent (SGD) method, and stability is one popular approach to studying them. Existing works based on stability have studied nonconvex loss functions, but have only considered the generalization error of SGD in expectation. In this paper, we establish various generalization error bounds with probabilistic guarantees for SGD. Specifically, for both general nonconvex loss functions and gradient dominant loss functions, we characterize the on-average stability of the iterates generated by SGD in terms of the on-average variance of the stochastic gradients. This characterization leads to improved bounds on the generalization error of SGD. We then study the regularized risk minimization problem with strongly convex regularizers, and obtain improved generalization error bounds for proximal SGD. With strongly convex regularizers, we further establish generalization error bounds for nonconvex loss functions under proximal SGD with a high-probability guarantee, i.e., exponential concentration in probability.
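As a rough illustration of the proximal SGD iteration referred to above, the following minimal sketch assumes the strongly convex regularizer is h(w) = (lam / 2) * ||w||^2, whose proximal map has the closed form w / (1 + step * lam). The function names, the toy nonconvex loss 1 - tanh(y * <w, x>), the step size, and the synthetic data are all hypothetical choices made for illustration; they are not taken from the paper.

```python
import numpy as np

def prox_l2(w, step, lam):
    """Proximal map of h(w) = (lam / 2) * ||w||^2 for step size `step`.

    It minimizes h(u) + ||u - w||^2 / (2 * step) over u, which gives
    the closed form u = w / (1 + step * lam).
    """
    return w / (1.0 + step * lam)

def proximal_sgd(grad_fn, data, w0, step=0.1, lam=0.1, n_iters=1000, seed=0):
    """Proximal SGD: a stochastic gradient step on the loss, then the prox step."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(n_iters):
        z = data[rng.integers(len(data))]  # sample one data point uniformly
        w = prox_l2(w - step * grad_fn(w, z), step, lam)
    return w

def toy_grad(w, z):
    """Gradient of the (assumed) nonconvex loss 1 - tanh(y * <w, x>)."""
    x, y = z
    margin = y * np.dot(w, x)
    return -(1.0 - np.tanh(margin) ** 2) * y * x

# Hypothetical synthetic data: labels given by the sign of the first feature.
rng = np.random.default_rng(1)
features = rng.normal(size=(50, 2))
data = [(x, 1.0 if x[0] > 0 else -1.0) for x in features]

w_hat = proximal_sgd(toy_grad, data, w0=np.zeros(2), step=0.1, lam=0.1, n_iters=200)
```

The squared-norm regularizer is used here only because its proximal map is available in closed form; any other strongly convex regularizer would need its own proximal map in place of `prox_l2`.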

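This paper examines the stability, with respect to generalization, of stochastic gradient descent, an optimization method for deep learning models. It proposes a stability measure based on the variance of the gradients and, building on this measure, analyzes in turn general nonconvex loss functions, gradient dominant loss functions, and problems with strongly convex regularizers, obtaining a series of improved generalization error bounds.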

Generalization Error Bounds with Probabilistic Guarantee for Stochastic Gradient Descent in Nonconvex Optimization