BriefGPT.xyz
Oct, 2024
Active Learning for Vision-Language Models
Bardia Safaei, Vishal M. Patel
TL;DR
This work addresses the gap between current vision-language models (VLMs) and supervised deep models on specific computer vision tasks by proposing a new active learning framework that selects a small number of informative samples from unlabeled data for annotation, improving zero-shot classification performance. Experiments show that the method outperforms existing active learning schemes on multiple image classification datasets, significantly boosting the zero-shot performance of VLMs.
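The core active-learning step described above is: score each unlabeled sample by how informative it would be to label, then spend the annotation budget on the top-scoring ones. As an illustration only (not the paper's actual selection criterion), a common baseline scores samples by the entropy of the model's predicted class distribution, e.g. the softmax over CLIP image-text similarities:

```python
import numpy as np

def entropy_sampling(probs: np.ndarray, budget: int) -> np.ndarray:
    """Select the `budget` unlabeled samples whose predicted class
    distribution has the highest entropy (i.e., the most uncertain ones).

    probs: (N, C) array of per-sample class probabilities, e.g. the
    softmax over a VLM's image-text similarity scores. This is a generic
    uncertainty-sampling baseline, not the method proposed in the paper.
    """
    eps = 1e-12  # guard against log(0)
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    # indices of the `budget` most uncertain samples, most uncertain first
    return np.argsort(-entropy)[:budget]

# toy pool: 4 unlabeled samples, 3 classes
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # near-uniform, highly uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.45, 0.05],
])
picked = entropy_sampling(probs, budget=2)  # → indices [1, 3]
```

Entropy sampling alone is known to pick redundant or atypical samples, which is why dedicated VLM active-learning methods combine uncertainty with other signals; this sketch only shows the budget-constrained selection loop in its simplest form.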
Abstract
Pre-trained Vision-Language Models (VLMs) like CLIP have demonstrated impressive zero-shot performance on a wide range of downstream computer vision tasks. However, there still exists a considerable performance gap …