There has been significant progress in creating machine learning models that identify objects in scenes along with their associated attributes and relationships; however, there is a large gap between the best models and human capabilities. One of the major reasons for this gap is the difficulty of collecting sufficient amounts of annotated relations and attributes for training these systems. While some attributes and relations are abundant, their distribution in the natural world and in existing datasets is long-tailed. In this paper, we address this problem by introducing a novel incremental active learning framework that asks for attributes and relations in visual scenes. While conventional active learning methods ask for labels of specific examples, we flip this framing to allow agents to ask for examples from specific categories. Using this framing, we introduce an active sampling method that asks for examples from the tail of the data distribution and show that it outperforms classical active learning methods on Visual Genome.

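This paper introduces an active learning approach that addresses the shortage of training data for current machine learning systems by asking for attributes and relations in visual scenes. It also proposes an active sampling method that draws examples from the long tail of the data distribution and shows that it outperforms conventional active learning methods on the Visual Genome dataset.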

Could you give an example? Active learning for the long tail of attributes and relations.
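
As a rough illustration of the category-level framing described in the abstract, the sketch below has an agent ask for examples of the rarest relation categories instead of asking for labels of specific images. It is a toy under stated assumptions, not the paper's actual sampling method: the relation names, the counts, the `toy_oracle` annotator, and the "fewest examples first" rule are all hypothetical stand-ins.

```python
from collections import Counter


def pick_tail_categories(labels, all_categories, k=2):
    """Return the k categories with the fewest labeled examples so far."""
    counts = Counter(labels)
    # Unseen categories have count 0, so they are requested first.
    # Ties are broken alphabetically to keep the choice deterministic.
    return sorted(all_categories, key=lambda c: (counts[c], c))[:k]


def active_round(labels, all_categories, oracle, k=2, per_category=3):
    """One incremental round: ask the oracle for examples of tail categories."""
    new_labels = list(labels)
    for category in pick_tail_categories(labels, all_categories, k):
        examples = oracle(category, per_category)
        # Each returned example arrives already labeled with its category.
        new_labels.extend(category for _ in examples)
    return new_labels


def toy_oracle(category, n):
    """Hypothetical annotator that supplies n examples of a requested category."""
    return [f"{category}_example_{i}" for i in range(n)]


# Hypothetical long-tailed pool of relation labels: "on" dominates,
# "riding" is rare, and "feeding" has not been seen at all.
categories = ["on", "wearing", "riding", "feeding"]
labels = ["on"] * 50 + ["wearing"] * 20 + ["riding"] * 3

requested = pick_tail_categories(labels, categories, k=2)
updated = active_round(labels, categories, toy_oracle)

print(requested)
print(Counter(updated))
```

In this toy setup the agent requests examples of "feeding" and "riding" first, because they sit in the tail of the label counts, while a conventional active learner would instead pick individual unlabeled images and ask what they contain.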