BriefGPT.xyz
Jan, 2022
MVPTR: Multi-Level Semantic Alignment for Vision-Language Pre-Training via Multi-Stage Learning
MVP: Multi-Stage Vision-Language Pre-Training via Multi-Level Semantic Alignment
Zejun Li, Zhihao Fan, Huaixiao Tou, Zhongyu Wei
TL;DR
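This paper proposes MVPTR, a vision-language pre-training method based on multi-level semantic alignment. It learns concept representations through intra-modality multi-level representation learning and cross-modal semantic alignment tasks at different granularities, and emphasizes that multi-modal, multi-level learning can jointly promote representation learning.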
Abstract
In this paper, we propose a Multi-stage vision-language pre-training (MVP) framework to learn cross-modality representation via multi-level semantic alignment. We introduce concepts in both modalities to construc