BriefGPT.xyz
Nov, 2023
Filling the Image Information Gap for VQA: Prompting Large Language Models to Proactively Ask Questions
Ziyue Wang, Chi Chen, Peng Li, Yang Liu
TL;DR
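The authors design a framework that enables large language models to proactively ask questions that reveal more details in the image, improving performance on knowledge-based visual question answering tasks.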
Abstract
Large language models (LLMs) demonstrate impressive reasoning ability and the maintenance of world knowledge not only in natural language tasks, but also in some vision-language tasks such as open-domain