Deep Networks Can Generalize at Sharp Minima

March 2017

Sharp Minima Can Generalize For Deep Nets

Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio

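TL;DR: This paper examines generalization in deep learning and related questions concerning the loss function, and analyzes the symmetries and parameter space of deep networks.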

Abstract

Despite their overwhelming capacity to overfit, deep learning architectures tend to generalize relatively well to unseen data, allowing them to be deployed in practice. However, explaining why this is the case is still an open area of research. One standing hypothesis that is gaining p