May 2023
COLA: How to adapt vision-language models to Compose Objects Localized with Attributes?
Arijit Ray, Filip Radenovic, Abhimanyu Dubey, Bryan A. Plummer, Ranjay Krishna...
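TL;DR: The authors design the Cola benchmark, explore 6 fine-tuning strategies, and find that a lightweight multimodal adapter, which jointly attends over the image and language features produced by pretrained models, outperforms common strategies.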