BriefGPT.xyz
Dec, 2018
Recursive Visual Attention in Visual Dialog
Yulei Niu, Hanwang Zhang, Manli Zhang, Jianhong Zhang, Zhiwu Lu...
TL;DR
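This paper proposes a new attention mechanism, Recursive Visual Attention (RvA), to resolve visual co-reference in visual dialog. Experiments on the large-scale VisDial v0.9 and v1.0 datasets show that RvA not only outperforms the state of the art but also achieves reasonable recursion and interpretable attention maps without additional annotations.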
Abstract
Visual dialog is a challenging vision-language task, which requires the agent to answer multi-round questions about an image. It typically needs to address two major problems: (1) How to answer visually-grounded