BriefGPT.xyz
Dec, 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari, Dhruv Batra, Devi Parikh, Abhishek Das
TL;DR
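This paper proposes a ViLBERT-based approach that is first pretrained on vision-language datasets related to Visual Dialog and then transferred to training on Visual Dialog. The authors also find that fine-tuning on the dense annotations in Visual Dialog improves NDCG but lowers MRR.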
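To make the NDCG versus MRR trade-off concrete, here is a minimal sketch of the two metrics on a single toy question. It is not the authors' evaluation code. It assumes the common definitions: reciprocal rank of the single ground-truth answer, and NDCG computed over the top K ranked candidates, where K is the number of candidates with non-zero dense relevance. The candidate answers and relevance scores are made up for illustration.

```python
import math

def reciprocal_rank(ranked, gt_answer):
    """1 / (1-indexed rank of the ground-truth answer); MRR is the mean of this over questions."""
    return 1.0 / (ranked.index(gt_answer) + 1)

def ndcg(ranked, relevance):
    """NDCG over the top K candidates, K = number of candidates with relevance > 0 (assumed convention)."""
    k = sum(1 for r in relevance.values() if r > 0)
    if k == 0:
        return 0.0
    # Discounted gain of the model's ranking.
    dcg = sum(relevance.get(c, 0.0) / math.log2(i + 2)
              for i, c in enumerate(ranked[:k]))
    # Discounted gain of the best possible ranking.
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg

# Hypothetical example: "yes" is the ground-truth answer, but annotators
# also judge "yeah" to be fully relevant.
gt = "yes"
rel = {"yes": 1.0, "yeah": 1.0, "no": 0.0, "maybe": 0.0}

ranking_a = ["yes", "no", "yeah", "maybe"]   # ground truth first, paraphrase ranked low
ranking_b = ["yeah", "yes", "no", "maybe"]   # both relevant answers on top, ground truth second

rr_a, ndcg_a = reciprocal_rank(ranking_a, gt), ndcg(ranking_a, rel)
rr_b, ndcg_b = reciprocal_rank(ranking_b, gt), ndcg(ranking_b, rel)
print(f"A: RR={rr_a:.3f} NDCG={ndcg_a:.3f}")
print(f"B: RR={rr_b:.3f} NDCG={ndcg_b:.3f}")
```

In this toy case ranking B scores higher on NDCG but lower on reciprocal rank than ranking A, which is the kind of divergence the TL;DR describes: rewarding all relevant answers and rewarding only the ground-truth answer can pull a model's ranking in different directions.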
Abstract
Prior work in visual dialog has focused on training deep neural models on the VisDial dataset in isolation, which has led to great progress, but is limiting and wasteful. In this work, following recent trends in …