BriefGPT.xyz
Feb, 2022
VU-BERT: A Unified Framework for Visual Dialog
Tong Ye, Shijing Si, Jianzong Wang, Rui Wang, Ning Cheng...
TL;DR
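This paper proposes VU-BERT, a unified framework for joint image-text embedding. It simplifies the model by obtaining visual embeddings through patch projection, which addresses the difficulty of using the modality-specific modules that existing work relies on to model interactions, and it achieves competitive performance on visual dialog tasks.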
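To make "patch projection" concrete, here is a minimal sketch of the general technique: the image is cut into non-overlapping patches, each patch is flattened, and one shared linear layer maps it to an embedding vector. This is an illustration only, not the authors' implementation. The function names `extract_patches` and `project_patches`, the 4x4 image, the patch size of 2, and the embedding width of 8 are all assumptions chosen for readability; the summary does not state the paper's actual sizes.

```python
import random


def extract_patches(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            flat = []
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    flat.extend(pixel)  # append all channel values of this pixel
            patches.append(flat)
    return patches


def project_patches(patches, weight, bias):
    """Apply one shared linear layer to every patch.

    `weight` holds one list of length patch_dim per output dimension,
    and `bias` holds one scalar per output dimension.
    """
    return [
        [sum(p * w for p, w in zip(patch, column)) + b
         for column, b in zip(weight, bias)]
        for patch in patches
    ]


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical toy input: a 4 x 4 image with 3 channels.
    image = [[[random.random() for _ in range(3)] for _ in range(4)] for _ in range(4)]
    patches = extract_patches(image, patch_size=2)
    patch_dim, embed_dim = len(patches[0]), 8  # illustrative sizes, not from the paper
    weight = [[random.gauss(0.0, 0.02) for _ in range(patch_dim)] for _ in range(embed_dim)]
    bias = [0.0] * embed_dim
    embeddings = project_patches(patches, weight, bias)
    print(len(embeddings), len(embeddings[0]))
```

In a trained model the weights would be learned rather than sampled, and the resulting patch embeddings would be fed to the transformer together with the text token embeddings.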
Abstract
The visual dialog task attempts to train an agent to answer multi-turn questions given an image, which requires the deep understanding of interactions between the image and dialog history. Existing researches ten