BriefGPT.xyz
Dec, 2023
Understanding the Multi-modal Prompts of the Pre-trained Vision-Language Model
Shuailei Ma, Chen-Wei Xie, Ying Wei, Siyang Sun, Jiaqi Fan...
TL;DR
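By analyzing multi-modal prompts directly, we find that they improve the pre-trained model's recognition performance on a given dataset mainly by introducing learnable bias terms. Based on this, we propose bias tuning and show that it outperforms multi-modal prompts when the dataset's category information is limited.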
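To make the idea of "learnable bias terms" concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: it assumes a frozen linear scorer standing in for the pre-trained model, adds one trainable bias per class, and updates only those biases with softmax cross-entropy gradients. All names (`FROZEN_W`, `DATA`, `tune_bias`, `predict`) and the data are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def frozen_logits(weights, x):
    """Class scores from the frozen weights (one row per class); never updated."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def tune_bias(weights, data, num_classes, lr=0.1, epochs=200):
    """Learn one additive bias per class; the weights stay frozen."""
    bias = [0.0] * num_classes
    for _ in range(epochs):
        for x, label in data:
            logits = [z + b for z, b in zip(frozen_logits(weights, x), bias)]
            probs = softmax(logits)
            for k in range(num_classes):
                # Gradient of cross-entropy w.r.t. bias[k] is p_k - 1[k == label].
                grad = probs[k] - (1.0 if k == label else 0.0)
                bias[k] -= lr * grad
    return bias


def predict(weights, bias, x):
    """Index of the highest-scoring class after adding the bias."""
    logits = [z + b for z, b in zip(frozen_logits(weights, x), bias)]
    return max(range(len(logits)), key=logits.__getitem__)


# Hypothetical toy setup: the frozen scorer mislabels the first sample.
FROZEN_W = [[1.0, 0.0], [0.0, 1.0]]
DATA = [([1.0, 0.8], 1), ([2.0, 0.5], 0)]

tuned_bias = tune_bias(FROZEN_W, DATA, num_classes=2)
```

In this toy setup the frozen scorer assigns the first sample to class 0, and a shift in the per-class bias alone is enough to correct it without touching the weights. Whether the paper applies its bias terms at this point in the model is not stated in the summary above.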
Abstract
Prompt learning has emerged as an efficient alternative for fine-tuning foundational models, such as CLIP, for various downstream tasks. However, there is no work that provides a comprehensive explanation for the