BriefGPT.xyz
Nov, 2023
LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions
Songhao Han, Le Zhuo, Yue Liao, Si Liu
TL;DR
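An iterative optimization method with visual feedback, combining vision-language models (VLMs) with large language models (LLMs), significantly improves image classification performance and produces interpretable, robust feature descriptors.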
Abstract
Vision-language models (VLMs) offer a promising paradigm for image classification by comparing the similarity between images and class embeddings. A critical challenge lies in crafting precise
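As a rough illustration of the similarity-based classification the abstract mentions, the sketch below picks the class whose embedding is closest to an image embedding under cosine similarity. It is a minimal sketch, not the authors' method: the function names `cosine_similarity` and `classify` are hypothetical, cosine similarity is an assumed choice of measure, and the three-dimensional vectors are invented toy values standing in for real VLM embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(image_embedding, class_embeddings):
    """Return the best-matching class name and the per-class scores."""
    scores = {
        name: cosine_similarity(image_embedding, embedding)
        for name, embedding in class_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy embeddings, invented for illustration only (assumption).
# In practice these would come from a VLM's text and image encoders.
class_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.1, 0.9, 0.0],
    "bird": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.3, 0.1]

label, scores = classify(image_embedding, class_embeddings)
print(label)
```

In this setup, the quality of the class embeddings depends on the text used to describe each class, which is where descriptions of the kind the paper studies would enter.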