Sep, 2019
UNITER: Learning UNiversal Image-TExt Representations
Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed El Kholy, Faisal Ahmed...
TL;DR
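This work introduces UNITER, a UNiversal Image-TExt Representation learned through large-scale pre-training on four image-text datasets (COCO, Visual Genome, Conceptual Captions, and SBU Captions), which provides joint multimodal embeddings for heterogeneous downstream V+L tasks.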
Abstract
Joint image-text embedding is the bedrock for most Vision-and-Language (V+L) tasks, where multimodality inputs are jointly processed for visual and textual understanding. In this paper, we introduce UNITER, a UNi