BriefGPT.xyz
Jan, 2024
The Neglected Tails of Vision-Language Models
Shubham Parashar, Zhiqiu Lin, Tian Liu, Xiangjue Dong, Yanan Li...
TL;DR
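Vision-language models (VLMs) excel at zero-shot recognition, but their performance varies drastically across visual concepts. Our work makes the first attempt to measure concept frequency by analyzing pretraining text, and proposes REtrieval-Augmented Learning (REAL), a method for mitigating the imbalanced performance of VLMs in zero-shot recognition.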
Abstract
Vision-language models (VLMs) excel in zero-shot recognition but exhibit drastically imbalanced performance across visual concepts. For ex