BriefGPT.xyz
Jan, 2025
Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning
Huabin Liu, Filip Ilievski, Cees G. M. Snoek
TL;DR
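This study addresses spurious correlations between videos and answers in commonsense video question answering by proposing a video-grounded entailment tree reasoning method. The method proceeds in four steps: entailment tree construction, video-language entailment verification, tree reasoning, and dynamic tree expansion. It can be applied to existing video and image models. Experimental results show that the method has a notable effect across different benchmarks and reasoning types.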
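The four steps named in the summary can be illustrated with a small sketch. This is not the authors' implementation: the `Node` class, the `reason` function, the `verify` and `expand` callbacks, the confidence threshold, the depth limit, and the assumption that child statements combine conjunctively are all hypothetical choices made here for illustration. In a real system, `verify` would wrap a video-language model and `expand` would wrap a model that decomposes a statement into simpler sub-statements.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """One statement in a (hypothetical) entailment tree."""
    statement: str
    children: List["Node"] = field(default_factory=list)


def reason(
    node: Node,
    verify: Callable[[str], Optional[float]],
    expand: Callable[[str], List[Node]],
    threshold: float = 0.5,
    depth: int = 0,
    max_depth: int = 2,
) -> bool:
    """Return True if the statement is judged to be supported by the video.

    Assumption: a parent holds only if all of its children hold.
    `verify` returns a confidence in [0, 1], or None when the statement
    cannot be checked against the video directly.
    """
    # Tree reasoning: an inner node is decided by its children.
    if node.children:
        return all(
            reason(c, verify, expand, threshold, depth + 1, max_depth)
            for c in node.children
        )

    # Video-language entailment verification on a leaf.
    score = verify(node.statement)

    # Dynamic tree expansion: decompose a leaf that cannot be verified.
    if score is None and depth < max_depth:
        node.children = expand(node.statement)
        if node.children:
            return all(
                reason(c, verify, expand, threshold, depth + 1, max_depth)
                for c in node.children
            )

    return score is not None and score >= threshold


# Toy stand-ins for the models (made-up statements and scores).
toy_scores = {"a person holds a cup": 0.9, "the person drinks": 0.8}


def toy_verify(statement: str) -> Optional[float]:
    return toy_scores.get(statement)


def toy_expand(statement: str) -> List[Node]:
    if statement == "a person drinks from a cup":
        return [Node("a person holds a cup"), Node("the person drinks")]
    return []


root = Node("a person drinks from a cup")
supported = reason(root, toy_verify, toy_expand)
unsupported = reason(Node("a dog barks"), toy_verify, toy_expand)
```

In this toy run, the root statement cannot be verified directly, so it is expanded into two sub-statements that are each checked. A statement that can be neither verified nor expanded is treated as unsupported.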
Abstract
This paper proposes the first video-grounded entailment tree reasoning method for commonsense video question answering (VQA). Despite the remarkable progress of large ...