VLP: A Survey on Vision-Language Pre-training

Feb, 2022

Feilong Chen, Duzhen Zhang, Minglun Han, Xiuyi Chen, Jing Shi...

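TL;DR: This paper surveys recent advances and new frontiers in vision-language pre-training (VLP). It is the first survey focused on VLP and gives a concrete summary of VLP models, aiming to shed light on future research in the VLP field.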

Abstract

In the past few years, the emergence of pre-training models has brought uni-modal fields such as computer vision (CV) and natural language processing (NLP) to a new era. Substantial works have shown they are beneficial for downstream uni-modal tasks and avoid training a new model from