Oct 2023
Distilling from Vision-Language Models for Improved OOD Generalization in Vision Tasks
Sravanti Addepalli, Ashish Ramayee Asokan, Lakshay Sharma, R. Venkatesh Babu
TL;DR: Vision-Language to Vision-Align, Distill, Predict (VL2V-ADiP) is a proposed approach that aligns the vision and language modalities of a Vision-Language Model such as CLIP and then distills from it, retaining pre-trained features while transferring the model's superior generalization. It achieves state-of-the-art results in Domain Generalization.