The past several years have seen remarkable progress in generative models which produce convincing samples of images and other modalities. A shared component of many powerful generative models is a decoder network, a parametric deep neural net that defines a generative distribution. Examples include variational autoencoders, generative adversarial networks, and generative moment matching networks. Unfortunately, it can be difficult to quantify the performance of these models because of the intractability of log-likelihood estimation, and inspecting samples can be misleading. We propose to use Annealed Importance Sampling for evaluating log-likelihoods for decoder-based models and validate its accuracy using bidirectional Monte Carlo. Using this technique, we analyze the performance of decoder-based models, the effectiveness of existing log-likelihood estimators, the degree of overfitting, and the degree to which these models miss important modes of the data distribution.

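This paper proposes using Annealed Importance Sampling to evaluate log-likelihoods for decoder-based models and validates its accuracy with bidirectional Monte Carlo. It then analyzes the performance of decoder-based models, the effectiveness of existing log-likelihood estimators, the degree of overfitting, and the extent to which these models miss important modes of the data distribution.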
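To make the idea concrete, here is a minimal sketch of Annealed Importance Sampling estimating log p(x) for a decoder-based model. Everything specific in it is an assumption made for illustration and is not taken from the paper: the "decoder" is a hypothetical one-dimensional linear-Gaussian map p(x|z) = N(x; w z + b, sigma^2) with prior z ~ N(0, 1), chosen because its exact marginal likelihood is known and can serve as a check; the annealing path is geometric with a linear schedule; and the transition operator is a simple random-walk Metropolis step, used here only for brevity. The function and parameter names are likewise illustrative.

```python
import math
import random


def log_normal(x, mean, std):
    """Log density of a univariate Gaussian N(x; mean, std^2)."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((x - mean) / std) ** 2)


def ais_log_likelihood(x, w, b, sigma, n_chains=500, n_temps=100,
                       step=0.5, seed=0):
    """AIS estimate of log p(x) for a toy linear-Gaussian decoder (assumed)."""
    rng = random.Random(seed)
    # Linear inverse-temperature schedule from the prior (0) to the posterior (1).
    betas = [t / n_temps for t in range(n_temps + 1)]

    def log_lik(z):
        # log p(x | z) under the hypothetical decoder.
        return log_normal(x, w * z + b, sigma)

    def log_f(z, beta):
        # Unnormalized intermediate target: p(z) * p(x | z)^beta.
        return log_normal(z, 0.0, 1.0) + beta * log_lik(z)

    log_weights = []
    for _ in range(n_chains):
        z = rng.gauss(0.0, 1.0)  # exact sample from the prior
        log_w = 0.0
        for beta_prev, beta in zip(betas[:-1], betas[1:]):
            # Importance-weight update for moving from beta_prev to beta.
            log_w += (beta - beta_prev) * log_lik(z)
            # One random-walk Metropolis step that leaves f_beta invariant.
            proposal = z + step * rng.gauss(0.0, 1.0)
            log_accept = log_f(proposal, beta) - log_f(z, beta)
            if rng.random() < math.exp(min(0.0, log_accept)):
                z = proposal
        log_weights.append(log_w)

    # Numerically stable log of the average importance weight.
    m = max(log_weights)
    return m + math.log(sum(math.exp(lw - m) for lw in log_weights) / n_chains)


def exact_log_likelihood(x, w, b, sigma):
    """Closed-form log p(x): the marginal is N(x; b, w^2 + sigma^2)."""
    return log_normal(x, b, math.sqrt(w ** 2 + sigma ** 2))


if __name__ == "__main__":
    est = ais_log_likelihood(1.5, w=2.0, b=0.5, sigma=0.5)
    ref = exact_log_likelihood(1.5, w=2.0, b=0.5, sigma=0.5)
    print("AIS estimate:", est, "exact:", ref)
```

In this toy setting the estimate can be compared directly with the closed form. For a real decoder network no closed form exists, which is why an independent check such as bidirectional Monte Carlo is needed to validate the estimator.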

Quantitative Analysis of Decoder-Based Generative Models