High Probability Analysis for Non-Convex Stochastic Optimization with Clipping

July 2023

Shaojie Li, Yong Liu

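TL;DR: Examines the tail behavior of gradients in stochastic optimization algorithms that use gradient clipping, along with the theoretical guarantees of the technique.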

Abstract

Gradient clipping is a commonly used technique to stabilize the training process of neural networks. A growing body of studies has shown that gra