BriefGPT.xyz
Mar, 2024
Composed Video Retrieval via Enriched Context and Discriminative Embeddings
Omkar Thawakar, Muzammal Naseer, Rao Muhammad Anwer, Salman Khan, Michael Felsberg...
TL;DR
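A novel CoVR framework that uses detailed language descriptions to explicitly encode query-specific contextual information and learns discriminative vision-only, text-only, and vision-text embeddings, so that matching target videos are retrieved more accurately.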
Abstract
Composed video retrieval (CoVR) is a challenging problem in computer vision which has recently highlighted the integration of modification text with ...