Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events

April 2018

Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events

Sanjeel Parekh, Slim Essid, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez...

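TL;DR: This paper proposes a novel multimodal learning framework that learns from unsynchronized audio and visual events and performs both event classification and localization. The method achieves state-of-the-art results on a large-scale dataset of weakly labeled audio event videos.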

Abstract

Audio-visual representation learning is an important task from the perspective of designing machines with the ability to understand complex events. To this end, we propose a novel multimodal framework that instan