BriefGPT.xyz
Jan, 2024
Text-Video Retrieval via Variational Multi-Modal Hypergraph Networks
Qian Li, Lixin Su, Jiashu Zhao, Long Xia, Hengyi Cai...
TL;DR
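We propose a chunk-level matching method for text-video retrieval. It constructs a multi-modal hypergraph and introduces variational inference to model the complex n-ary interactions between text and video in a high-order semantic space, thereby improving retrieval performance.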
Abstract
Text-video retrieval is a challenging task that aims to identify relevant videos given textual queries. Compared to conventional textual retrieval, the main obstacle for text-video retrieval is the