Saliency-based Sequential Image Attention with Multiset Prediction

November 2017

Saliency-based Sequential Image Attention with Multiset Prediction

Sean Welleck, Jialin Mao, Kyunghyun Cho, Zheng Zhang

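TL;DR: This paper proposes a hierarchical visual architecture, based on a model of visual attention, that combines a saliency map with an attention mechanism for multi-label image classification. The model supports multiset prediction and is trained with reinforcement learning, which allows for arbitrary label permutations and one-to-many prediction. Experiments show that the model achieves high precision and high recall on multi-label image classification while also localizing objects.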

Abstract

Humans process visual scenes selectively and sequentially using attention. Central to models of human visual attention is the saliency map. We propose a hierarchical visual architecture that operates on a