BriefGPT.xyz
May, 2018
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Pan Lu, Lei Ji, Wei Zhang, Nan Duan, Ming Zhou...
TL;DR
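By constructing the Relation-VQA dataset and adopting a novel multi-step attention model, the paper proposes a framework for learning visual relation facts that makes better use of the semantic knowledge in images, achieving state-of-the-art performance on the visual question answering task.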
Abstract
Recently, visual question answering (VQA) has emerged as one of the most significant tasks in multimodal learning as it requires understanding both visual and textual modalities. Existing methods mainly rely on e