Unregularized deep neural networks (DNNs) can be easily overfit with a limited sample size. We argue that this is mostly due to the disriminative nature of DNNs which directly model the conditional probability (or score) of labels given the input. The ignorance of input distribution ma