Mar, 2024
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
Shihao Zhao, Shaozhe Hao, Bojia Zi, Huaizhe Xu, Kwan-Yee K. Wong
TL;DR
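This work proposes LaVi-Bridge, a pipeline that integrates advanced language models with generative vision models for text-to-image generation, and shows that the integration can significantly improve performance on measures such as text alignment and image quality.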
Abstract
Text-to-image generation has made significant advancements with the introduction of text-to-image diffusion models. These models typically consist of a language model that interprets user prompts and a vision mod