Oct, 2023

Generating Context-Aware Natural Answers for Questions in 3D Scenes

Mohammed Munzer Dwedari, Matthias Niessner, Dave Zhenyu Chen

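TL;DR: In the young field of 3D vision-language, we recast question answering as a sequence generation task so that the model produces free-form natural answers to questions about 3D scenes (Gen3DQA). We directly optimize the model for global sentence semantics and use a pragmatic language understanding reward to further improve sentence quality. Our method sets a new state of the art on the ScanQA benchmark (CIDEr scores of 72.22/66.57 on the test sets).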
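The summary does not spell out how a sentence-level score is optimized directly. One common approach for non-differentiable metrics is a policy-gradient (REINFORCE) loss with a baseline, and the sketch below illustrates that idea only. It is an assumption, not the authors' implementation: the function names are hypothetical, and unigram F1 is used as a simple stand-in for a metric such as CIDEr.

```python
from collections import Counter


def toy_sentence_reward(candidate: str, reference: str) -> float:
    """Unigram F1 overlap between two sentences.

    Stand-in for a sentence-level metric such as CIDEr (assumption).
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared word at most as often
    # as it appears in both sentences.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def reinforce_loss(token_log_probs, reward, baseline=0.0):
    """REINFORCE loss for one sampled answer.

    loss = -(reward - baseline) * sum of token log-probabilities.
    A positive advantage raises the likelihood of the sampled answer.
    """
    return -(reward - baseline) * sum(token_log_probs)


# Hypothetical example: a sampled answer is scored against a reference,
# and the score of a greedy answer serves as the baseline.
reference = "brown wooden chair"
sampled_reward = toy_sentence_reward("a brown wooden chair", reference)
greedy_reward = toy_sentence_reward("chair", reference)
loss = reinforce_loss([-0.5, -1.0, -0.25, -0.25], sampled_reward, greedy_reward)
```

Using the greedy answer's score as the baseline follows the self-critical style of training: a sampled answer is reinforced only when it scores better than what the model would have produced anyway.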

Abstract

3D question answering is a young field in 3D vision-language that is yet to be explored. Previous methods are limited to a pre-defined answer space and cannot generate answers naturally. In this work, we pivot th