BriefGPT.xyz
Jan, 2024
用大型语言模型实现细粒度视觉识别的民主化
Democratizing Fine-grained Visual Recognition with Large Language Models
HTML
PDF
Mingxuan Liu, Subhankar Roy, Wenjing Li, Zhun Zhong, Nicu Sebe...
TL;DR
使用大型语言模型作为代理,FineR在语义细分类别推理方面体现出更好性能,优于几种先进的FGVR和语音与视觉助手模型,并展示了在野外和新领域中工作的潜力。
Abstract
Identifying
subordinate-level categories
from images is a longstanding task in computer vision and is referred to as
fine-grained visual recognition
(FGVR). It has tremendous significance in real-world applicatio
→