BriefGPT.xyz
Jul, 2024
Tuning Vision-Language Models with Candidate Labels by Prompt Alignment
Zhifang Zhang, Beibei Li
TL;DR
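We propose a framework for prompt learning on VLMs with candidate labels. It resolves label ambiguity by combining learnable and handcrafted prompts, guided by the model output and the class posterior predictions, and it introduces different training objectives that further improve performance.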
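To make the idea of disambiguating candidate labels more concrete, here is a minimal sketch in plain Python. The summary does not give the exact formulation, so everything specific here is an assumption for illustration only: the function names, the mixing weight `alpha`, the simple convex combination of the two class posteriors, and the renormalization over the candidate set. It is not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_target(learnable_logits, handcrafted_logits, candidates, alpha=0.5):
    """Build a soft target restricted to the candidate label set.

    Assumption: the class posteriors from the learnable prompt and the
    handcrafted prompt are mixed with a hypothetical weight `alpha`,
    then masked to the candidate labels and renormalized.
    """
    p_learn = softmax(learnable_logits)
    p_hand = softmax(handcrafted_logits)
    mixed = [alpha * a + (1 - alpha) * b for a, b in zip(p_learn, p_hand)]
    # Labels outside the candidate set receive zero probability.
    masked = [p if k in candidates else 0.0 for k, p in enumerate(mixed)]
    total = sum(masked)
    return [p / total for p in masked]


def cross_entropy(target, logits):
    """Cross-entropy between a soft target and the model's softmax output."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


# Toy example: three classes, the true label is known to be class 0 or class 1.
learnable = [2.0, 0.5, 0.1]
handcrafted = [1.5, 1.0, 0.2]
target = candidate_target(learnable, handcrafted, candidates={0, 1})
loss = cross_entropy(target, learnable)
```

In this sketch the soft target places all of its mass on the candidate labels, and the loss pulls the learnable prompt's output toward it. Other training objectives could be substituted for `cross_entropy` without changing the target construction.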
Abstract
Vision-language models (VLMs) can learn high-quality representations from a large-scale training dataset of image-text pairs. Prompt learning is a popular approach to fine-tuning VLMs to adapt them to downstream t