deep learning researchers commonly suggest that converged models are stuck in local minima. More recently, some researchers observed that under reasonable assumptions, the vast majority of critical points are saddle points, not true minima. Both descriptions suggest that weights conver