BriefGPT.xyz
Sep, 2021
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss
Xing Cheng, Hezheng Lin, Xiangyu Wu, Fan Yang, Dong Shen
TL;DR
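This paper proposes a method built on a multi-stream corpus alignment network and a dual softmax loss (CAMoE and DSL). It targets the overfitting and relatively poor retrieval efficiency that CLIP shows because video and text differ in structure and content, and it achieves state-of-the-art results on a range of benchmarks.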
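The summary names the dual softmax loss without defining it. As a rough illustration only, the sketch below follows one common reading of the idea: a softmax taken along the opposite axis of the video-text similarity matrix acts as a prior that rescales each similarity before the usual cross-entropy over matched pairs. The function names, the `temp` parameter, and the pure-Python list representation are assumptions made for this example, not the authors' implementation.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dual_softmax_loss(sim, temp=1.0):
    """One direction (video -> text) of a dual-softmax-style loss.

    sim[i][j] is the similarity between video i and text j; matched
    pairs are assumed to sit on the diagonal. `temp` is a hypothetical
    temperature that scales the prior.
    """
    n = len(sim)
    # Prior: softmax down each column, i.e. across videos for a fixed text.
    column_prior = [
        softmax([temp * sim[i][j] for i in range(n)]) for j in range(n)
    ]
    # Rescale every similarity by its prior value.
    revised = [
        [sim[i][j] * column_prior[j][i] for j in range(n)] for i in range(n)
    ]
    # Standard cross-entropy along each row, with the diagonal as target.
    loss = 0.0
    for i in range(n):
        probs = softmax(revised[i])
        loss -= math.log(probs[i])
    return loss / n


def symmetric_dual_softmax_loss(sim, temp=1.0):
    """Sum of the video -> text and text -> video directions."""
    transposed = [list(row) for row in zip(*sim)]
    return dual_softmax_loss(sim, temp) + dual_softmax_loss(transposed, temp)
```

With an all-zero similarity matrix the prior has no effect and the loss reduces to the plain cross-entropy of a uniform guess; a matrix with a strong diagonal yields a lower value.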
Abstract
Employing the large-scale pre-trained model CLIP for the video-text retrieval (VTR) task has become a new trend that outperforms previous VTR methods. However, due to the heterogeneity of structures and contents betw