Due to the non-smoothness of optimization problems in Machine Learning, generalized smoothness assumptions have attracted considerable attention in recent years. One of the most popular assumptions of this type is $(L_0,L_1)$-smoothness (Zhang et al., 2020). In this paper, we focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods. In particular, we derive improved convergence rates for Gradient Descent with (Smoothed) Gradient Clipping and for Gradient Descent with Polyak Stepsizes. In contrast to existing results, our rates do not rely on the standard smoothness assumption and do not suffer from an exponential dependence on the initial distance to the solution. We also extend these results to the stochastic case under the over-parameterization assumption, propose a new accelerated method for convex $(L_0,L_1)$-smooth optimization, and derive new convergence rates for Adaptive Gradient Descent (Malitsky and Mishchenko, 2020).
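
For intuition, the two stepsize rules named above can be sketched on a toy problem. For twice-differentiable functions, the condition of Zhang et al. (2020) reads $\|\nabla^2 f(x)\| \le L_0 + L_1\|\nabla f(x)\|$. The sketch below is illustrative only: the test function $f(x)=\sum_i \cosh(x_i)$ (which satisfies the condition with $L_0=L_1=1$), the function names `smoothed_clipping_gd` and `polyak_gd`, the stepsize form $\eta/(L_0+L_1\|\nabla f(x)\|)$ with $\eta=1$, and the iteration counts are all assumptions chosen for this example and are not taken from the paper.

```python
import math

def f(x):
    # Toy objective (assumption): f(x) = sum_i cosh(x_i), minimized at x = 0.
    return sum(math.cosh(xi) for xi in x)

def grad_f(x):
    # Gradient of the toy objective; its Hessian norm is at most 1 + ||grad||,
    # so (L0, L1) = (1, 1) is a valid choice for this function.
    return [math.sinh(xi) for xi in x]

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

def smoothed_clipping_gd(x0, L0=1.0, L1=1.0, eta=1.0, iters=200):
    # Gradient descent with a smoothed-clipping stepsize eta / (L0 + L1 * ||g||).
    # The exact constants are an assumption made for this sketch.
    x = list(x0)
    for _ in range(iters):
        g = grad_f(x)
        step = eta / (L0 + L1 * norm(g))
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def polyak_gd(x0, f_star, iters=200):
    # Gradient descent with the Polyak stepsize (f(x) - f_star) / ||g||^2,
    # which requires knowing the optimal value f_star.
    x = list(x0)
    for _ in range(iters):
        g = grad_f(x)
        g_sq = sum(gi * gi for gi in g)
        if g_sq == 0.0:  # already at a stationary point
            break
        step = (f(x) - f_star) / g_sq
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

if __name__ == "__main__":
    x0 = [5.0, -3.0]
    print("smoothed clipping:", norm(smoothed_clipping_gd(x0)))
    print("Polyak stepsize:  ", norm(polyak_gd(x0, f_star=2.0)))
```

Both rules shrink the step automatically when the gradient is large, which is why neither needs a global Lipschitz constant for the gradient on this example; the Polyak variant trades knowledge of $L_0$ and $L_1$ for knowledge of the optimal value.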

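This work addresses the non-smoothness of optimization problems in machine learning and presents new convergence guarantees for convex $(L_0,L_1)$-smooth functions. It improves the convergence rates of gradient descent methods, proposes a new accelerated method, extends the results to the stochastic setting, and provides new convergence rates for Adaptive Gradient Descent.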

Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity