BriefGPT.xyz
Jun, 2024
Simple and Effective Masked Diffusion Language Models
Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin...
TL;DR
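Simplified masked discrete diffusion models perform better at language modeling than previously thought. They can be used to train encoder-only language models with efficient samplers, and they achieve new state-of-the-art results on language modeling benchmarks.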
Abstract
While diffusion models excel at generating high-quality images, prior work reports a significant performance gap between diffusion and autoregressive (AR) methods in language modeling. In this work, we show that simple …