BriefGPT.xyz
Feb, 2019
Dual Attention Networks for Visual Reference Resolution in Visual Dialog
Gi-Cheon Kang, Jaeseo Lim, Byoung-Tak Zhang
TL;DR
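This paper introduces the Dual Attention Networks (DAN) model, an approach to a computer vision task that matches information between the dialog history and image features. By taking contextual information into account and learning with a self-attention mechanism, it addresses visual reference resolution and achieves significant performance improvements on multiple datasets.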
Abstract
Visual dialog (VisDial) is a task which requires an AI agent to answer a series of questions grounded in an image. Unlike in visual questi