BriefGPT.xyz
Dec, 2022
Fine-tuned CLIP Models are Efficient Video Learners
Hanoona Rasheed, Muhammad Uzair Khattak, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
TL;DR
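The paper proposes a new module for explicitly modeling temporal sequences and shows that fine-tuning the CLIP model on video effectively transfers image-level representations to the video domain, with good experimental results.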
Abstract
Large-scale multi-modal training with image-text pairs imparts strong generalization to CLIP model. Since training on a similar scale for videos is infeasible, recent approaches focus on the effective transfer of