BriefGPT.xyz
Apr, 2022
Question-Driven Graph Fusion Network For Visual Question Answering
Yuxi Qian, Yuncong Hu, Ruonan Wang, Fangxiang Feng, Xiaojie Wang
TL;DR
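This paper proposes QD-GFN, which uses three graph attention networks to model the semantic, spatial, and implicit visual relationships in an image, and introduces question information to guide the aggregation of the three graphs. An object filtering mechanism removes objects in the image that are irrelevant to the question. Experimental results show that QD-GFN outperforms existing state-of-the-art VQA models, and that the new graph aggregation method and the object filtering mechanism both play an important role in improving the model's performance.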
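The two ideas in the summary above (question-guided aggregation of three graph outputs, and filtering of question-irrelevant objects) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the dot-product relevance score, the top-k cutoff, the mean pooling, and the names `filter_objects` and `question_driven_fusion` are all assumptions made for illustration, and random arrays stand in for the outputs of the three graph attention networks.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def filter_objects(obj_feats, question, keep):
    """Return indices of the `keep` objects most relevant to the question.

    Relevance is a plain dot product here (an assumption, not the paper's score).
    """
    scores = obj_feats @ question                # (num_objects,)
    top = np.argsort(-scores)[:keep]             # highest scores first
    return np.sort(top)                          # restore original object order


def question_driven_fusion(graph_outputs, question):
    """Fuse per-graph node features with question-dependent weights.

    graph_outputs: (3, num_objects, dim), one slice per relation graph
    question:      (dim,)
    """
    pooled = graph_outputs.mean(axis=1)          # (3, dim): one summary per graph
    weights = softmax(pooled @ question)         # (3,): how much to trust each graph
    fused = np.tensordot(weights, graph_outputs, axes=1)  # (num_objects, dim)
    return fused, weights


rng = np.random.default_rng(0)
num_objects, dim = 6, 4
question = rng.normal(size=dim)
objects = rng.normal(size=(num_objects, dim))

kept = filter_objects(objects, question, keep=4)

# Hypothetical stand-ins for the semantic, spatial, and implicit graph outputs,
# computed only over the objects that survived filtering.
graph_outputs = rng.normal(size=(3, kept.size, dim))
fused, weights = question_driven_fusion(graph_outputs, question)
```

In this sketch the question decides both which objects are kept and how much each relation graph contributes, so a change in the question changes the fused representation even when the image features are identical.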
Abstract
Existing visual question answering (VQA) models have explored various visual relationships between objects in the image to answer complex questions, which inevitably introduces irrelevant information brought by inaccurate