May, 2023
Modularized Zero-shot VQA with Pre-trained Models
Rui Cao, Jing Jiang
TL;DR
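This paper explores how to leverage pre-trained models for zero-shot visual question answering. A modularized zero-shot network decomposes each question into explicit reasoning sub-steps and assigns each sub-task to a suitable pre-trained model, which makes the reasoning process more interpretable. Experiments show that our method is both more effective and more interpretable than the baseline methods.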
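The decompose-and-dispatch idea can be illustrated with a toy sketch. This is not the authors' implementation: the module names (`locate`, `query_attribute`), the hand-written step list, and the dictionary-based "image" are all hypothetical stand-ins, assuming only that each sub-step is routed to a dedicated model and that the intermediate results are kept as a trace.

```python
from typing import Any, Callable, Dict, List, Tuple

# Toy "image": a list of annotated objects stands in for real pixels (assumption).
Image = List[Dict[str, str]]


def locate(image: Image, state: Any, arg: str) -> Any:
    """Hypothetical stub for a grounding model: keep objects whose name matches."""
    return [obj for obj in image if obj["name"] == arg]


def query_attribute(image: Image, state: Any, arg: str) -> Any:
    """Hypothetical stub for an attribute model: read one attribute of the first located object."""
    return state[0][arg] if state else "unknown"


# Registry mapping each reasoning sub-step to the (stub) model that handles it.
MODULES: Dict[str, Callable[[Image, Any, str], Any]] = {
    "locate": locate,
    "query_attribute": query_attribute,
}


def run_program(image: Image, steps: List[Tuple[str, str]]) -> Tuple[Any, list]:
    """Run the sub-steps in order, passing each result to the next step."""
    state: Any = None
    trace = []  # intermediate outputs, kept so the reasoning can be inspected
    for name, arg in steps:
        state = MODULES[name](image, state, arg)
        trace.append((name, arg, state))
    return state, trace


# "What color is the cup?" decomposed by hand into two sub-steps.
image = [{"name": "cup", "color": "red"}, {"name": "table", "color": "brown"}]
steps = [("locate", "cup"), ("query_attribute", "color")]
answer, trace = run_program(image, steps)
print(answer)
```

In this sketch the `trace` list is what provides interpretability: each entry records which module ran, with what argument, and what it returned.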
Abstract
Large-scale pre-trained models (PTMs) show great zero-shot capabilities. In this paper, we study how to leverage them for zero-shot visual question answering (VQA). Our approach is motivated by a few observations …