TL;DR: This paper studies the implicit regularization of the gradient descent algorithm in homogeneous neural networks, focusing on optimizing the logistic loss or cross-entropy loss of any homogeneous model. It analyzes a smoothed version of the normalized margin, formulates a corresponding margin-maximization optimization problem, establishes asymptotic guarantees for the algorithm, and discusses the potential benefit of improved model robustness through training.
Abstract
Recent works on implicit regularization have shown that gradient descent converges to the max-margin direction for logistic regression with one-layer or multi-layer linear networks. In this paper, we generalize this result to homogeneous neural networks.
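To make the quantities in the TL;DR concrete, here is a minimal sketch in standard notation (the symbols $\Phi$, $q_n$, $\bar{\gamma}$, $\tilde{\gamma}$ are notational assumptions of this note, not quoted from the paper): for an $L$-homogeneous model trained on an exponential-type loss, the normalized margin, its smoothed version, and the associated margin-maximization problem can be written as follows.

```latex
% L-homogeneity of the model: scaling the parameters by c > 0
% scales every output by c^L.
\Phi(c\theta; x) = c^{L}\,\Phi(\theta; x), \qquad c > 0.

% Normalized margin over a dataset {(x_n, y_n)}_{n=1}^{N}
% (q_n is the unnormalized margin on example n):
q_n(\theta) = y_n\,\Phi(\theta; x_n), \qquad
\bar{\gamma}(\theta) = \frac{\min_{n} q_n(\theta)}{\lVert \theta \rVert_2^{L}}.

% Smoothed version: replace the hard minimum by the log-sum-exp
% implicit in the training loss \mathcal{L}(\theta) = \sum_n e^{-q_n(\theta)}:
\tilde{\gamma}(\theta) = \frac{-\log \mathcal{L}(\theta)}{\lVert \theta \rVert_2^{L}},
\qquad
\bar{\gamma}(\theta) - \frac{\log N}{\lVert \theta \rVert_2^{L}}
\;\le\; \tilde{\gamma}(\theta) \;\le\; \bar{\gamma}(\theta).

% Margin-maximization problem: minimum parameter norm subject to
% every example having margin at least 1.
\min_{\theta}\ \tfrac{1}{2}\lVert \theta \rVert_2^{2}
\quad \text{s.t.} \quad q_n(\theta) \ge 1, \qquad n = 1, \dots, N.
```

The sandwich bound in the middle display indicates why the smoothed margin is a faithful surrogate: its gap to the hard margin shrinks as the parameter norm grows during training.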