Jun, 2024
Simplified and Generalized Masked Diffusion for Discrete Data
Jiaxin Shi, Kehang Han, Zhe Wang, Arnaud Doucet, Michalis K. Titsias
TL;DR
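Masked diffusion models are an alternative to autoregressive models for generating discrete data. This paper proposes a simple and general framework that unlocks the full potential of masked diffusion models. Models trained on the OpenWebText dataset surpass GPT-2 in perplexity and perform strongly on 5 zero-shot language modeling tasks; in pixel-level image modeling, the approach also outperforms previous discrete diffusion models.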
Abstract
Masked (or absorbing) diffusion is actively explored as an alternative to autoregressive models for generative modeling of discrete data. However, existing work in this area has been hindered by unnecessarily com…