BriefGPT.xyz
Nov, 2023
Contrastive Vision-Language Alignment Makes Efficient Instruction Learner
Lizhao Liu, Xinyu Sun, Tianhang Xiang, Zhuangwei Zhuang, Liuren Yin...
TL;DR
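By applying contrastive and generative methods to align the representations of the ViT and the LLM, we propose CG-VLM, a model that effectively aligns vision and language and serves as an efficient instruction learner.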
Abstract
We study the task of extending the large language model (LLM) into a vision-language instruction-following model. This task is crucial but challenging since the LLM is trained on text modality only, making it har