BriefGPT.xyz
Dec, 2021
3D Question Answering
Shuquan Ye, Dongdong Chen, Songfang Han, Jing Liao
TL;DR
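This paper proposes a Transformer-based 3D question answering framework, named 3DQA-TR, which encodes multimodal information using both appearance and geometry to answer questions about 3D scenes. The authors also develop the first 3DQA dataset, "ScanQA", which contains roughly 6K questions and 30K answers and is used to validate the effectiveness of 3DQA-TR. Experimental results show that the 3DQA framework outperforms existing VQA frameworks and that its design choices are effective.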
Abstract
Visual question answering (VQA) has witnessed tremendous progress in recent years. However, most efforts only focus on the 2D image question answering tasks. In this paper, we present the first attempt at extending VQA to the …