BriefGPT.xyz
Nov, 2024
HyperGLM: HyperGraph for Video Scene Graph Generation and Anticipation
Trong-Thuan Nguyen, Pha Nguyen, Jackson Cothren, Alper Yilmaz, Khoa Luu
TL;DR
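This work addresses the shortcomings of existing video scene graph generation methods in handling complex multi-object interactions and reasoning. The proposed HyperGLM builds a unified scene hypergraph that supports reasoning over multi-way interactions and higher-order relationships. Experiments show that HyperGLM outperforms state-of-the-art methods on all five tasks, offering a more effective approach to video scene understanding.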
Abstract
Multimodal LLMs have advanced vision-language tasks but still struggle with understanding video scenes. To bridge this gap, Video Scene Graph Generation (VidSGG) has emerged to capture multi-object relationships ...