BriefGPT.xyz
Jun, 2018
Co-Training of Audio and Video Representations from Self-Supervised Temporal Synchronization
Bruno Korbar, Du Tran, Lorenzo Torresani
TL;DR
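This work learns models for audio and video analysis through self-supervised temporal synchronization. Without fine-tuning, the model effectively identifies temporally synchronized audio-video pairs, and it provides a highly effective initialization that improves the accuracy of video-based action recognition models.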
Abstract
There is a natural correlation between the visual and auditive elements of a video. In this work we leverage this connection to learn general and effective features for both audio and
→