Recent empirical evidence indicates that many machine learning applications involve heavy-tailed gradient noise, which challenges the bounded-variance assumption that is standard in stochastic optimization. Gradient clipping has emerged as a popular tool for handling such noise,