May, 2023
TG-VQA: Ternary Game of Video Question Answering
Hao Li, Peng Jin, Zesen Cheng, Songyang Zhang, Kai Chen...
TL;DR
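This study uses game-theoretic interaction strategies to achieve fine-grained visual-semantic alignment in video question answering without requiring extensive annotation. Compared with existing methods, it improves results significantly on both long-term and short-term video question answering datasets, generalizes well, and converges well on limited data.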
Abstract
Video question answering aims at answering a question about the video content by reasoning the alignment semantics within them. However, since relying heavily on human instructions, i.e., annotations or priors, current ...