Coherent Zero-Shot Visual Instruction Generation

Jun, 2024

Quynh Phung, Songwei Ge, Jia-Bin Huang

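TL;DR: This paper proposes a simple, training-free framework that integrates text understanding and image generation to keep objects consistent and state transitions smooth when generating visual instructions. Experiments show that the method can generate consistent and visually appealing instructions.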

Abstract

Despite the advances in text-to-image synthesis, particularly with diffusion models, generating visual instructions that require consisten