BriefGPT.xyz
Jun, 2018
学习视觉知识记忆网络用于视觉问答
Learning Visual Knowledge Memory Networks for Visual Question Answering
HTML
PDF
Zhou Su, Chen Zhu, Yinpeng Dong, Dongqi Cai, Yurong Chen...
TL;DR
本文提出了一种基于VKMN的视觉知识存储网络,通过End-to-End的学习框架将结构化人类知识和深度视觉特征融入到记忆网络中来对抗视觉问答中缺乏对结构化知识的利用的问题,并在VQA 1.0和VQA 2.0基准测试中表现出显著的性能优势,特别是在涉及知识推理的问题方面。
Abstract
visual question answering
(VQA) requires joint comprehension of images and natural language questions, where many questions can't be directly or clearly answered from visual content but require reasoning from
structured
→