BriefGPT.xyz
Mar, 2022
Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
Haojun Jiang, Yuanze Lin, Dongchen Han, Shiji Song, Gao Huang
TL;DR
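This paper proposes Pseudo-Q, a novel method that automatically generates pseudo language queries to replace human annotation in visual grounding. It develops a visual-language model with a task-related query prompt module and a cross-modal multi-level attention mechanism. Experimental results show that the method substantially reduces human labeling cost while achieving strong weakly supervised visual grounding performance.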
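The summary does not say how the pseudo queries are built, so the following is only a minimal sketch of the general idea of pairing automatically generated text with image regions instead of human-written queries. Everything in it is an assumption made for illustration: the `Detection` type, the `generate_pseudo_queries` helper, and the "left/right" template are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass(frozen=True)
class Detection:
    """Hypothetical output of an off-the-shelf object detector."""
    label: str
    box: Box


def generate_pseudo_queries(detections: List[Detection]) -> List[Tuple[str, Box]]:
    """Build (query, box) training pairs without human annotation.

    Assumed template: a unique object is named by its label alone;
    when several objects share a label, the leftmost and rightmost
    ones are disambiguated with a simple spatial word.
    """
    groups = defaultdict(list)
    for det in detections:
        groups[det.label].append(det)

    pairs: List[Tuple[str, Box]] = []
    for label, dets in groups.items():
        if len(dets) == 1:
            pairs.append((label, dets[0].box))
            continue
        # Sort by horizontal box center to pick out the extremes.
        ordered = sorted(dets, key=lambda d: (d.box[0] + d.box[2]) / 2)
        pairs.append((f"left {label}", ordered[0].box))
        pairs.append((f"right {label}", ordered[-1].box))
    return pairs


detections = [
    Detection("dog", (200, 30, 260, 90)),
    Detection("dog", (10, 20, 50, 80)),
    Detection("ball", (100, 60, 120, 80)),
]
pairs = generate_pseudo_queries(detections)
```

Pairs produced this way could stand in for manually written queries when training a grounding model, which is the labeling cost the summary refers to.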
Abstract
Visual grounding, i.e., localizing objects in images according to natural language queries, is an important topic in visual language understanding. The most effective approaches for this task are based on