BriefGPT.xyz
Apr, 2023
为视觉和语言模型命名类别的学习
Learning to Name Classes for Vision and Language Models
HTML
PDF
Sarah Parisot, Yongxin Yang, Steven McDonagh
TL;DR
使用可用数据为每个类学习最佳词嵌入作为视觉内容的函数,以此来解决零样本识别对手工类名的高度敏感以及适应新、较小数据集的困难。我们证明,该解决方案可以轻松集成在图像分类和物体检测管道中,在多种情况下产生显著的性能增益,并提供模型偏差和标注误差的见解。
Abstract
Large scale
vision and language models
can achieve impressive
zero-shot recognition
performance by mapping class specific text queries to image content. Two distinct challenges that remain however, are high sensi
→