We present neural autoregressive distribution estimation (NADE) models, which are neural network architectures applied to the problem of unsupervised distribution and density estimation. They leverage the probability product rule and a weight sharing scheme inspired from