Analyzing the Behavior of Visual Question Answering Models

Jun, 2016

Aishwarya Agrawal, Dhruv Batra, Devi Parikh

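TL;DR: This paper studies deep-learning-based visual question answering models and notes that the accuracy of existing models falls between 60% and 70%. It proposes systematic methods for analyzing the behavior of these models and finds that they have several shortcomings: they are not comprehensive enough, they are prone to arriving at wrong answers, and their answers are hard to correct.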

Abstract

Recently, a number of deep-learning-based models have been proposed for the task of visual question answering (VQA). The performance of most models is clustered around 60-70%. In this paper we propose systematic