Unregularized deep neural networks (DNNs) overfit easily when the sample size is limited. We argue that this is mostly due to the discriminative nature of DNNs, which directly model the conditional probability (or score) of labels given the input. Ignoring the input distribution makes it difficult for DNNs to generalize to unseen data. Recent advances in regularization techniques, such as pretraining and dropout, indicate that modeling the input data distribution (either explicitly or implicitly) greatly improves the generalization ability of a DNN. In this work, we explore the manifold hypothesis, which assumes that instances within the same class lie on a smooth manifold. Accordingly, we propose two simple regularizers for a standard discriminative DNN. The first, named Label-Aware Manifold Regularization, assumes the availability of labels and penalizes large norms of the gradient of the loss function w.r.t. data points. The second, named Label-Independent Manifold Regularization, does not use label information and instead penalizes the Frobenius norm of the Jacobian matrix of prediction scores w.r.t. data points, which makes semi-supervised learning possible. We perform extensive controlled experiments on fully supervised and semi-supervised tasks using the MNIST dataset and achieve state-of-the-art results on it.
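
To make the two penalties concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a linear softmax classifier as a stand-in for a DNN, so that both derivatives have closed forms; a real network would obtain them by automatic differentiation. It also assumes that "prediction scores" means softmax probabilities, that the squared norm is penalized, and that the penalty is added to the loss with a weight `lam`. The function names and `lam` are hypothetical and chosen only for illustration.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, b, x):
    """Class probabilities of a linear softmax model (assumed stand-in for a DNN)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)


def label_aware_penalty(W, b, x, y):
    """Squared L2 norm of d(cross-entropy)/dx for the labeled point (x, y)."""
    p = predict(W, b, x)
    # For softmax cross-entropy, dL/dlogit_k = p_k - 1[k == y].
    delta = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
    # Chain rule through the linear layer: dL/dx_j = sum_k delta_k * W[k][j].
    grad_x = [sum(delta[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]
    return sum(g * g for g in grad_x)


def label_independent_penalty(W, b, x):
    """Squared Frobenius norm of the Jacobian dp/dx; no label is needed."""
    p = predict(W, b, x)
    n_classes, n_inputs = len(W), len(x)
    total = 0.0
    for j in range(n_inputs):
        # Probability-weighted average of column j of W.
        mean_w = sum(p[c] * W[c][j] for c in range(n_classes))
        for k in range(n_classes):
            # dp_k/dx_j = p_k * (W[k][j] - sum_c p_c * W[c][j])
            total += (p[k] * (W[k][j] - mean_w)) ** 2
    return total


def regularized_loss(W, b, x, y=None, lam=0.1):
    """Assumed combination: cross-entropy plus the label-aware penalty when y
    is given; only the label-independent penalty for an unlabeled point."""
    if y is None:
        return lam * label_independent_penalty(W, b, x)
    p = predict(W, b, x)
    return -math.log(p[y]) + lam * label_aware_penalty(W, b, x, y)
```

Under these assumptions, a labeled point contributes its cross-entropy plus the label-aware term, while an unlabeled point contributes only the label-independent term, which is what lets unlabeled data enter training in the semi-supervised setting.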

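This study examines the causes of overfitting in deep neural networks and proposes regularization methods based on the manifold hypothesis, covering manifold regularization both with and without labels. Experiments show that these methods can significantly improve the generalization performance of the model.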

Manifold Regularized Discriminative Neural Networks