Learning, taking into account full distribution of the data, referred to as
generative, is not feasible with deep neural networks (DNNs) because they model
only the conditional distribution of the outputs given the inputs. Current
solutions are either based on joint probability models