BriefGPT.xyz
Jan, 2021
Reasoning over Vision and Language: Exploring the Benefits of Supplemental Knowledge
Violetta Shevchenko, Damien Teney, Anthony Dick, Anton van den Hengel
TL;DR
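This paper studies injecting knowledge from general-purpose knowledge bases into vision-and-language models. An auxiliary training objective enriches the learned representations with semantic and relational knowledge, which improves performance on tasks such as question answering and visual reasoning. The technique is model-agnostic and adds little computational overhead.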
Abstract
The limits of applicability of vision-and-language models are defined by the coverage of their training data. Tasks like vision question answering (VQA) often require commonsense and factual information beyond wh