BriefGPT.xyz
Jul, 2023
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability
Xuanlin Li, Yunhao Fang, Minghua Liu, Zhan Ling, Zhuowen Tu...
TL;DR
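Distilling large vision-language models is a promising direction. This paper studies how to transfer the visual representations of a large vision-language model into a lightweight student model using small- or medium-scale datasets. It proposes two principles for improving the student's open-vocabulary out-of-distribution (OOD) generalization and reports significant improvements on open-vocabulary OOD classification tasks.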
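As a rough illustration of the setup the TL;DR describes, the sketch below shows a generic feature-matching distillation loss: the student's image embeddings are pulled toward the teacher's by minimizing the mean cosine distance between paired vectors. This is a common baseline objective and not necessarily the paper's method; the two principles the paper proposes are not spelled out in this summary and are not implemented here. The function names and toy embeddings are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_distillation_loss(student_feats, teacher_feats):
    """Mean (1 - cosine similarity) over paired student/teacher embeddings.

    Assumption: both models embed the same batch of images into vectors of
    the same dimension. This is a generic illustrative loss, not the
    objective used in the paper.
    """
    if not student_feats or len(student_feats) != len(teacher_feats):
        raise ValueError("need two non-empty batches of equal size")
    total = 0.0
    for s, t in zip(student_feats, teacher_feats):
        s_hat, t_hat = l2_normalize(s), l2_normalize(t)
        total += 1.0 - sum(a * b for a, b in zip(s_hat, t_hat))
    return total / len(student_feats)


# Toy embeddings (hypothetical): two images, two-dimensional features.
teacher = [[1.0, 0.0], [0.0, 2.0]]
aligned_student = [[3.0, 0.0], [0.0, 0.5]]     # same directions as the teacher
misaligned_student = [[0.0, 1.0], [1.0, 0.0]]  # orthogonal to the teacher

aligned_loss = cosine_distillation_loss(aligned_student, teacher)
misaligned_loss = cosine_distillation_loss(misaligned_student, teacher)
```

Because the loss depends only on direction, a student whose embeddings point the same way as the teacher's incurs no penalty regardless of their magnitude, while orthogonal embeddings incur the larger penalty.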
Abstract
Large vision-language models have achieved outstanding performance, but their size and computational requirements make their deployment on resource-constrained devices and time-sensitive tasks impractical.
model distill