BriefGPT.xyz
Jun, 2020
Video Understanding as Machine Translation
Bruno Korbar, Fabio Petroni, Rohit Girdhar, Lorenzo Torresani
TL;DR
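This paper discusses the growth of self-supervised learning on large-scale multimodal video datasets. It proposes a generative-model-based approach that casts the problem as translation and applies it to a range of downstream video understanding tasks. The results show that the method outperforms approaches based on contrastive metric learning.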
Abstract
With the advent of large-scale multimodal video datasets, especially sequences with audio or transcribed speech, there has been a growing interest in self-supervised learning of …