Jun, 2023
Waffling around for Performance: Visual Classification with Random Words and Broad Concepts
Karsten Roth, Jae Myung Kim, A. Sophia Koepke, Oriol Vinyals, Cordelia Schmid...
TL;DR
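This paper proposes WaffleCLIP, a framework that simply replaces LLM-generated descriptors with random character and word sequences. Without querying any external model, it achieves comparable performance gains across a large number of visual classification tasks. The paper also studies experimentally the impact and the shortcomings of the additional semantics introduced by LLM-generated descriptors.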
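The descriptor replacement summarized above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from this summary: the prompt template, the helper names `random_descriptor` and `waffle_style_prompts`, the toy vocabulary, and the choice to share one set of random descriptors across all classes. It only builds the text prompts; encoding them with a CLIP-like model is left out.

```python
import random
import string


def random_descriptor(rng, vocab, n_words=2, word_len=5, use_chars=False):
    """Return a descriptor made of random words or random character strings."""
    if use_chars:
        # Random lowercase character "words", e.g. "qzkfw plmna"
        words = [
            "".join(rng.choice(string.ascii_lowercase) for _ in range(word_len))
            for _ in range(n_words)
        ]
    else:
        # Random words drawn from a (hypothetical) vocabulary
        words = [rng.choice(vocab) for _ in range(n_words)]
    return " ".join(words)


def waffle_style_prompts(class_names, vocab, n_pairs=2, seed=0):
    """Build prompts per class using random descriptors instead of LLM output.

    Assumption: the same random descriptors are reused for every class,
    and the template below is a hypothetical stand-in.
    """
    rng = random.Random(seed)
    descriptors = []
    for _ in range(n_pairs):
        descriptors.append(random_descriptor(rng, vocab))
        descriptors.append(random_descriptor(rng, vocab, use_chars=True))
    return {
        name: [f"A photo of a {name}, which has {d}." for d in descriptors]
        for name in class_names
    }


if __name__ == "__main__":
    toy_vocab = ["foot", "loud", "jumps", "green", "table"]  # illustrative only
    for name, prompts in waffle_style_prompts(["cat", "dog"], toy_vocab).items():
        print(name, prompts)
```

In a full pipeline, one plausible next step would be to embed each class's prompts with the text encoder and average them into a single class representation. That step is not shown here and is likewise an assumption.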
Abstract
The visual classification performance of vision-language models such as CLIP can benefit from additional semantic knowledge, e.g. via larg…