In recent years, the success of deep learning has inspired many researchers
to study the optimization of general smooth non-convex functions. However,
recent works have established pessimistic worst-case complexities for this
class functions, which is in stark contrast with their super