Sep, 2022
MUST-VQA: MUltilingual Scene-text VQA
Emanuele Vivoli, Ali Furkan Biten, Andres Mafla, Dimosthenis Karatzas, Lluis Gomez
TL;DR
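This paper proposes a framework for zero-shot multilingual scene-text visual question answering. It first introduces the more general MUST-VQA task, discusses two evaluation scenarios under a constrained setting, and demonstrates that models are viable in the zero-shot setting. It further shows that adapting multilingual models to the STVQA task is effective.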
Abstract
In this paper, we present a framework for multilingual scene text visual question answering that deals with new languages in a zero-shot fashion. Specifically, we consider the task of Scene Text Visual Question A…