BriefGPT.xyz
Mar, 2021
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
Maksim Dzabraev, Maksim Kalashnikov, Stepan Komkov, Aleksandr Petiushko
TL;DR
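By properly combining multiple video captioning datasets, we present a new state of the art for the text-to-video retrieval task on the MSRVTT and LSMDC benchmarks. The results show that a single model achieves state-of-the-art results on both datasets without fine-tuning.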
Abstract
We present a new state-of-the-art on the text to video retrieval task on MSRVTT and LSMDC benchmarks where our model outperforms all previ