BriefGPT.xyz
Sep, 2024
Vision-Language Model Fine-Tuning via Simple Parameter-Efficient Modification
Ming Li, Jike Zhong, Chenxin Li, Liuzhuozheng Li, Nie Lin...
TL;DR
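This study addresses the neglect of classic fine-tuning in work on vision-language model (VLM) fine-tuning and offers a new perspective: tuning only specific parameters is enough to unlock the strengths of classic fine-tuning. The proposed ClipFit method adjusts only specific bias terms and normalization layers, improving the average harmonic mean accuracy of zero-shot CLIP by 7.27%.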
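To make the reported metric concrete, here is a minimal sketch of how an average harmonic mean accuracy could be computed. It assumes that the harmonic mean combines accuracy on base (seen) classes with accuracy on novel (unseen) classes, and that the average is taken over per-dataset harmonic means; neither detail is stated in the summary above. The function name, dataset names, and numbers are hypothetical and are not results from the paper.

```python
def harmonic_mean_accuracy(base_acc: float, novel_acc: float) -> float:
    """Harmonic mean of two accuracies given in percent.

    Returns 0.0 if either accuracy is not positive, since the harmonic
    mean is undefined (or zero in the limit) in that case.
    """
    if base_acc <= 0 or novel_acc <= 0:
        return 0.0
    return 2 * base_acc * novel_acc / (base_acc + novel_acc)


# Hypothetical (base, novel) accuracies per dataset; NOT taken from the paper.
toy_results = {
    "dataset_a": (80.0, 60.0),
    "dataset_b": (90.0, 90.0),
}

# Assumption: compute the harmonic mean per dataset, then average them.
per_dataset = {
    name: harmonic_mean_accuracy(base, novel)
    for name, (base, novel) in toy_results.items()
}
average_h = sum(per_dataset.values()) / len(per_dataset)

for name, h in per_dataset.items():
    print(f"{name}: H = {h:.2f}")
print(f"average H = {average_h:.2f}")
```

The harmonic mean penalizes imbalance: for dataset_a the arithmetic mean of 80 and 60 is 70, while the harmonic mean is lower, so a method cannot score well by doing well on only one of the two class groups.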
Abstract
Recent advances in fine-tuning Vision-Language Models (VLMs) have witnessed the success of prompt tuning and adapter tuning, while the classic model