BriefGPT.xyz
Mar, 2016
Image Captioning and Visual Question Answering Based on Attributes and Their Related External Knowledge
Qi Wu, Chunhua Shen, Anton van den Hengel, Peng Wang, Anthony Dick
TL;DR
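This paper proposes a method for incorporating high-level concepts into the successful convolutional neural network (CNN) and recurrent neural network (RNN) approach, and shows that it yields significant improvements in both image captioning and visual question answering. The same mechanism can also be used to incorporate external knowledge, which in particular allows questions about the contents of an image to be answered even when the image alone does not provide the complete answer.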
Abstract
Much recent progress in vision-to-language problems has been achieved through a combination of convolutional neural networks (CNNs) and recurrent ...