BriefGPT.xyz
Mar, 2025
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola, Aaron Gokaslan, Justin T Chiu, Zhihan Yang, Zhixuan Qi...
TL;DR
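This work addresses two limitations of diffusion language models, weaker likelihood modeling and fixed-length generation, by proposing a class of block diffusion language models that support flexible-length generation and improve inference efficiency. The results show that block diffusion models set a new state of the art among diffusion models on language modeling benchmarks and can generate sequences of arbitrary length.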
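As a rough sketch of what interpolating between the two model families can look like, assume a sequence x is split into B contiguous blocks x^1, ..., x^B (this notation is an assumption for illustration, not taken from this page). The likelihood is then factorized autoregressively over blocks, while each block-level conditional is modeled by a diffusion process rather than token by token:

```latex
% Assumed notation: x^{b} is the b-th block, x^{<b} denotes all earlier blocks.
\log p_\theta(x) \;=\; \sum_{b=1}^{B} \log p_\theta\!\left(x^{b} \mid x^{<b}\right)
```

Under this reading, a block size of one token corresponds to an ordinary autoregressive model, and a single block spanning the whole sequence corresponds to an ordinary diffusion model. Because blocks are generated one after another, the output length does not have to be fixed in advance.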
Abstract
Diffusion language models offer unique benefits over autoregressive models due to their potential for parallelized generation and controllability, yet they lag in likelihood modeling and are limited to fixed-leng