Kirill Gavrilyuk, Mihir Jain, Ilia Karmanov, Cees G. M. Snoek
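TL;DR This paper introduces MotionFit, a self-training method that uses a 3D convolutional neural network, a motion model, and pseudo-labels to improve performance on downstream video tasks. The method performs well on small-scale video datasets, far outperforming alternative knowledge transfer approaches, semi-supervised learning, and video-only self-supervised learning.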
Abstract
The goal of this paper is to self-train a 3D convolutional neural network on an unlabeled video collection for deployment on small-scale video collections. As smaller video datasets benefit more from motion than appearance, we strive to train our network using optical flow, but avoid i