BriefGPT.xyz
Jan, 2021
Scheduled Sampling in Vision-Language Pretraining with Decoupled Encoder-Decoder Network
Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
TL;DR
This paper proposes an encoder-decoder architecture with a two-stream decoupled design, pretrained jointly for vision-language understanding and generation; dedicated pretraining strategies optimize the encoder and decoder, yielding strong generalization.
Abstract
Despite having impressive vision-language (VL) pretraining with BERT-based encoder for VL understanding, the pretraining of a universal encoder-decoder for both VL understanding and generation remains challenging.
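The scheduled sampling named in the title is a standard technique for narrowing the gap between teacher-forced training (decoder always sees gold tokens) and inference (decoder sees its own predictions). The sketch below is a minimal, generic illustration of that idea, not the paper's implementation; the function name, the `predict_fn` callback, and the fixed `sample_prob` are all assumptions for illustration.

```python
import random

def scheduled_sampling_inputs(gold_tokens, predict_fn, sample_prob, rng=random):
    """Build decoder inputs by mixing gold tokens with model predictions.

    At each step, with probability `sample_prob` the decoder is fed the
    model's prediction for that position (here obtained via `predict_fn`
    on the prefix built so far) instead of the ground-truth token,
    reducing the train/inference mismatch known as exposure bias.
    """
    inputs = [gold_tokens[0]]  # first input is always the start/gold token
    for t in range(1, len(gold_tokens)):
        if rng.random() < sample_prob:
            inputs.append(predict_fn(inputs[:t]))  # feed model's own output
        else:
            inputs.append(gold_tokens[t])          # feed ground truth
    return inputs
```

With `sample_prob=0.0` this reduces to teacher forcing; in practice the probability is typically annealed upward over training so the decoder is gradually exposed to its own predictions.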