BriefGPT.xyz
Jul, 2024
The Solution for the ICCV 2023 Perception Test Challenge 2023 -- Task 6 -- Grounded videoQA
Hailiang Zhang, Dian Chao, Zhihao Guan, Yang Yang
TL;DR
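This study presents a grounded video question-answering solution. Combining visual grounding with object tracking, it proposes an alternative two-stage approach and uses the VALOR model to answer questions and generate bounding boxes.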
Abstract
In this paper, we introduce a grounded video question-answering solution. Our research reveals that the fixed official baseline method for video question answering involves two main steps: visual grounding and …