Visual Question Answering using Deep Learning: A Survey and Performance Analysis

Aug, 2019

Yash Srivastava, Vaishnav Murali, Shiv Ram Dubey, Snehasis Mukherjee

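TL;DR: This survey introduces the visual question answering (VQA) task, which covers image recognition guided by natural-language descriptions and the study of machine learning models. It focuses on recently published datasets in the field, new deep learning models, and some applications and open challenges of VQA models.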

Abstract

The visual question answering (VQA) task combines challenges from both visual and linguistic processing in order to answer basic "common sense" questions about given images. Given an image and a question in natural language, the VQA system tries to find the correct answer t