We introduce FigureQA, a visual reasoning corpus of over one million question-answer pairs grounded in over 100,000 images. The images are synthetic, scientific-style figures from five classes: line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts. We formulate our reasoning task by generating questions from 15 templates; questions concern various relationships between plot elements and examine characteristics like the maximum, the minimum, area-under-the-curve, smoothness, and intersection. Resolving such questions often requires reference to multiple plot elements and synthesis of information distributed spatially throughout a figure. To facilitate the training of machine learning systems, the corpus also includes side data that can be used to formulate auxiliary objectives. In particular, we provide the numerical data used to generate each figure as well as bounding-box annotations for all plot elements. We study the proposed visual reasoning task by training several models, including the recently proposed Relation Network as a strong baseline. Preliminary results indicate that the task poses a significant machine learning challenge. We envision FigureQA as a first step towards developing models that can intuitively recognize patterns from visual representations of data.

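FigureQA is a visual reasoning corpus containing over one million question-answer pairs grounded in over 100,000 images. The images are synthetic, scientific-style figures from five classes: line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts. By generating questions from 15 templates and providing side data for training machine learning models, FigureQA takes a first step towards developing models that can intuitively recognize patterns in visual representations of data.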

FigureQA: An Annotated Figure Dataset for Visual Reasoning