BriefGPT.xyz
Aug, 2023
Improving Text Representation in Text-Based Visual Question Answering
Making the V in Text-VQA Matter
Shamanthak Hegde, Soumya Jahagirdar, Shankar Gangisetty
TL;DR
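By combining the TextVQA and VQA datasets, we propose a method that strengthens the understanding of, and the association between, text and image features, thereby improving answer accuracy.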
Abstract
Text-based VQA aims at answering questions by reading the text present in the images. It requires a large amount of scene-text relationship understanding compared to the VQA task. Recent studies have shown that t