We introduce marginalization models (MaMs), a new family of generative models for high-dimensional discrete data. They offer scalable and flexible generative modeling with tractable likelihoods by explicitly modeling all induced marginal distributions. Marginalization models enable fast evaluation of arbitrary marginal probabilities with a single forward pass of the neural network, which overcomes a major limitation of methods with exact marginal inference, such as autoregressive models (ARMs). We propose scalable methods for learning the marginals, grounded in the concept of "marginalization self-consistency". Unlike previous methods, MaMs support scalable training of any-order generative models for high-dimensional problems in the energy-based training setting, where the goal is to match the learned distribution to a given target distribution (specified by an unnormalized (log) probability function, such as an energy function or a reward function). We demonstrate the effectiveness of the proposed model on a variety of discrete data distributions, including binary images, language, physical systems, and molecules, in both maximum likelihood and energy-based training settings. MaMs achieve orders-of-magnitude speedups in evaluating marginal probabilities in both settings. For energy-based training tasks, MaMs enable any-order generative modeling of high-dimensional problems beyond the capability of previous methods. Code is at https://github.com/PrincetonLIPS/MaM.
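To make "marginalization self-consistency" concrete, the following is a minimal sketch of the underlying sum rule, p(x_S) = Σ_{x_i} p(x_S, x_i), checked by brute force on a toy distribution. It is not the MaM implementation: the three-variable binary table, the names `joint`, `marginal`, and `self_consistency_gap`, and the use of `None` to mark a summed-out variable are all assumptions made for illustration. Exact marginals satisfy the constraint trivially; a learned marginal estimator would presumably be trained so that the same gap stays small, but the abstract does not specify the objective.

```python
import itertools
import numpy as np

D = 3  # number of binary variables (toy size, chosen for illustration)
rng = np.random.default_rng(0)

# A random normalized joint distribution p(x1, x2, x3) stored as a 2x2x2 table.
joint = rng.random((2,) * D)
joint /= joint.sum()


def marginal(partial):
    """Brute-force marginal probability of a partial assignment.

    `partial` is a tuple of length D with entries 0, 1, or None,
    where None marks a variable that is summed out.
    """
    index = tuple(slice(None) if v is None else v for v in partial)
    return float(np.sum(joint[index]))


def self_consistency_gap(partial, i):
    """Return |p(x_S) - sum over x_i of p(x_S, x_i)| for a variable i not in S."""
    assert partial[i] is None, "variable i must be unobserved in `partial`"
    total = 0.0
    for value in (0, 1):
        extended = list(partial)
        extended[i] = value
        total += marginal(tuple(extended))
    return abs(marginal(partial) - total)


# Check the constraint for every partial assignment and every unobserved variable.
max_gap = max(
    self_consistency_gap(partial, i)
    for partial in itertools.product((0, 1, None), repeat=D)
    for i in range(D)
    if partial[i] is None
)
print(f"largest self-consistency violation: {max_gap:.2e}")
```

The brute-force `marginal` sums over up to 2^D table entries per query, which is exactly the cost that becomes infeasible in high dimensions and that a single forward pass of a network is meant to avoid.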

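This paper introduces marginalization models (MaMs), a new class of generative models for high-dimensional discrete data. By explicitly modeling all induced marginal distributions, MaMs offer scalable and flexible generative modeling with tractable likelihoods, and they evaluate arbitrary marginal probabilities quickly in a single forward pass of the neural network. The model suits energy-based training tasks, in which the learned distribution is matched to a target probability (specified by an unnormalized (log) probability function, such as an energy function or a reward function), and it shows significant performance advantages on several discrete data distributions.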

Generative marginalization models