Jan, 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie, Farley Lai, Derek Doran, Asim Kadav
TL;DR
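This paper introduces a new inference task, Visual Entailment (VE), and builds a dataset, SNLI-VE, which is used to evaluate existing VQA baselines. It also builds a model named EVE to address the VE task; the model reaches 71% accuracy, and the paper demonstrates EVE's interpretability through cross-modal attention.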
Abstract
Existing visual reasoning datasets, such as Visual Question Answering (VQA), often suffer from biases conditioned on the question, image or answer distributions. The recently proposed CLEVR dataset addresses these …