Chaofeng Chen, Annan Wang, Haoning Wu, Liang Liao, Wenxiu Sun...
TL;DR: Fine-tuning the text encoder with reinforcement learning improves text-image alignment, which in turn improves image quality.
Abstract
Text-to-image diffusion models are typically trained to optimize the log-likelihood objective, which presents challenges in meeting specific requirements for downstream tasks, such as image aesthetics and image-text alignment. Recent research addresses this issue by refining the diffus