It is arguably believed that flatter minima can generalize better. However, it has been pointed out that the usual definitions of sharpness, which consider either the maxima or the integral of loss over a $\delta$ ball of parameters around minima, cannot give consistent measurement for