BriefGPT.xyz
Nov, 2022
Text-Aware Dual Routing Network for Visual Question Answering
Luoqian Jiang, Yifan He, Jian Chen
TL;DR
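Proposes a text-aware dual routing network, named TDR, that achieves strong performance on visual question answering, particularly on number-related questions.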
Abstract
Visual question answering (VQA) is a challenging task to provide an accurate natural language answer given an image and a natural language question about the image. It involves multi-modal learning, i.e., compute