BriefGPT.xyz
Jul, 2023
LOIS: Looking Out of Instance Semantics for Visual Question Answering
Siyu Zhang, Yeming Chen, Yaoru Sun, Fang Wang, Haibo Shi...
TL;DR
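We propose LOIS, a finer model framework that does not use bounding boxes, to address the challenge of causal relations among object semantics in visual question answering. Two relation attention modules handle the label ambiguity caused by instance masks. Experiments show that the method performs well in improving visual reasoning capability.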
Abstract
Visual question answering (VQA) has been intensively studied as a multimodal task that requires effort in bridging vision and language to infer answers correctly. Recent attempts have developed various attention-based m...