Perfect match: Improved cross-modal embeddings for audio-visual synchronisation

September 2018

Soo-Whan Chung, Joon Son Chung, Hong-Goo Kang

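TL;DR: This paper proposes a new strategy for learning cross-modal embeddings by casting training as a multi-way matching problem. The approach significantly improves performance on audio-to-video synchronisation, and the learned embeddings are also used for self-supervised visual speech recognition.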
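To make the multi-way matching idea concrete, here is a minimal sketch in plain Python. It assumes that one video embedding is compared against several candidate audio embeddings, that similarity is measured by Euclidean distance, and that training minimises a cross-entropy over a softmax of negative distances. The distance measure, the loss form, and every name below (`euclidean`, `multiway_matching_loss`, `predict_match`) are illustrative assumptions, not details given in this summary.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def multiway_matching_loss(video_emb, audio_embs, target):
    """Cross-entropy for picking the matching audio among N candidates.

    Assumption: logits are negative distances, so a closer candidate
    receives a higher softmax probability.
    """
    logits = [-euclidean(video_emb, a) for a in audio_embs]
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]

def predict_match(video_emb, audio_embs):
    """Index of the candidate audio embedding closest to the video embedding."""
    dists = [euclidean(video_emb, a) for a in audio_embs]
    return min(range(len(dists)), key=dists.__getitem__)

# Hypothetical toy embeddings: candidate 0 is the in-sync audio clip.
video = [1.0, 0.0]
candidates = [[0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]]

best = predict_match(video, candidates)
loss_correct = multiway_matching_loss(video, candidates, target=0)
loss_wrong = multiway_matching_loss(video, candidates, target=1)
```

In this toy setup the loss is lower when the target is the nearby candidate than when it is a distant one, which is the signal that would pull matching audio and video embeddings together during training.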

Abstract

This paper proposes a new strategy for learning powerful cross-modal embeddings for audio-to-video synchronization. Here, we set up the problem as one of …