BriefGPT.xyz
Aug, 2020
Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering
Jing Yu, Zihao Zhu, Yujing Wang, Weifeng Zhang, Yue Hu...
TL;DR
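This paper proposes a knowledge-based visual question answering model built on multiple knowledge graphs. Stacked GRUC modules reason in parallel over image information from different modalities, and a graph neural network is then used to obtain a globally optimal answer. The model achieves new state-of-the-art results on three popular benchmark datasets.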
Abstract
Knowledge-based Visual Question Answering (KVQA) requires external knowledge beyond the visible content to answer questions about an image. This ability is challenging but indispensable to achieve general VQA. On