Apr, 2022
Hierarchical Self-supervised Representation Learning for Movie Understanding
Fanyi Xiao, Kaustav Kundu, Joseph Tighe, Davide Modolo
TL;DR
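This paper presents a self-supervised video learning method for movie understanding. It uses a hierarchical pre-training strategy: the low level is pre-trained with contrastive learning, while at the high level an event mask prediction task pre-trains the video contextualizer model. The method achieves improved performance on the VidSitu benchmark, and the authors also demonstrate the effectiveness of the contextualized event features on LVU tasks.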
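The two pre-training stages summarized above can be illustrated with a small sketch. This is not the authors' implementation: it assumes an InfoNCE-style contrastive loss for the low level and a simple "replace one event feature with a mask token" step for the high level. The function names `info_nce` and `mask_events`, the temperature value, and the use of plain Python lists as feature vectors are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (assumed low-level objective).

    The loss is small when the anchor is closer to the positive
    than to any of the negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    peak = max(logits)
    log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denom - logits[0]


def mask_events(events, mask_index, mask_token):
    """Hide one event feature (assumed high-level pretext task input).

    Returns a masked copy of the sequence and the hidden feature,
    which a contextualizer model would be trained to predict.
    """
    masked = [list(e) for e in events]
    target = list(events[mask_index])
    masked[mask_index] = list(mask_token)
    return masked, target


if __name__ == "__main__":
    # Low level: two views of the same clip versus a different clip.
    print(info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]]))

    # High level: mask the middle event of a three-event sequence.
    events = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    masked, target = mask_events(events, 1, [0.0, 0.0])
    print(masked, target)
```

In this sketch a contextualizer's prediction for the masked position could be scored against `target` with the same `info_nce` function, treating the other events as negatives. That pairing is an assumption of the sketch, not a detail stated in the summary.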
Abstract
Most self-supervised video representation learning approaches focus on action recognition. In contrast, in this paper we focus on self-supervised video learning for movie understanding and propose a novel hierarc…