Sep, 2018
Perfect match: Improved cross-modal embeddings for audio-visual synchronisation
Soo-Whan Chung, Joon Son Chung, Hong-Goo Kang
TL;DR
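This paper proposes a new strategy for learning cross-modal embeddings by framing the task as a multi-way matching problem. The approach significantly improves performance on audio-to-video synchronisation, and the learned embeddings are also used for self-supervised visual speech recognition.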
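The multi-way matching idea can be illustrated with a minimal sketch. It assumes one video embedding is compared against N candidate audio embeddings, that similarity is the negative Euclidean distance, and that training uses a softmax cross-entropy over the N candidates. The function names `multiway_matching_loss` and `predict_offset`, and the choice of distance, are assumptions made for illustration and are not taken from the summary above.

```python
import math


def multiway_matching_loss(video_emb, audio_embs, target):
    """Cross-entropy over N candidate audio embeddings for one video embedding.

    Scores are negative Euclidean distances (an assumption of this sketch),
    so the candidate closest to the video embedding gets the highest score.
    """
    scores = [-math.dist(video_emb, a) for a in audio_embs]
    # Numerically stable log-sum-exp over the N candidates.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    # Negative log-probability of the true (synchronised) candidate.
    return -(scores[target] - log_z)


def predict_offset(video_emb, audio_embs):
    """Return the index of the candidate audio embedding nearest to the video."""
    return min(
        range(len(audio_embs)),
        key=lambda i: math.dist(video_emb, audio_embs[i]),
    )


# Hypothetical toy embeddings: candidate 1 is the in-sync audio window.
video = [1.0, 0.0]
audios = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
best = predict_offset(video, audios)
loss_true = multiway_matching_loss(video, audios, target=1)
loss_wrong = multiway_matching_loss(video, audios, target=0)
```

In this toy case `best` is 1, and the loss is lower when the target is the nearest candidate than when it is a distant one. That is the behaviour a multi-way objective would encourage during training.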
Abstract
This paper proposes a new strategy for learning powerful cross-modal embeddings for audio-to-video synchronization. Here, we set up the problem as one of …