Recursive Sub-query Construction Improves One-stage Visual Grounding

Aug, 2020

Improving One-stage Visual Grounding by Recursive Sub-query Construction

Zhengyuan Yang, Tianlang Chen, Liwei Wang, Jiebo Luo

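TL;DR: The paper proposes a recursive sub-query construction framework that addresses the limitations of current one-stage visual grounding methods, improving accuracy on long and complex queries and significantly outperforming existing one-stage baselines on multiple benchmark datasets.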

Abstract

We improve one-stage visual grounding by addressing current limitations on grounding long and complex queries. Existing one-stage methods encode the entire language query as a single sentence embedding vector, e.g., taking the embedding from BERT or the hidden state from LSTM. This sin