BriefGPT.xyz
Mar, 2024
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
Jongsuk Kim, Hyeongkeun Lee, Kyeongha Rho, Junmo Kim, Joon Son Chung
TL;DR
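Building on recent advances in self-supervised audio-visual representation learning, this work introduces the EquiAV framework, which leverages equivariance for audio-visual contrastive learning and aggregates features through a shared attention-based transformation predictor, thereby providing robust supervision. EquiAV outperforms previous work across a variety of audio-visual benchmarks.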
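For readers unfamiliar with audio-visual contrastive learning, the following is a minimal sketch of a generic symmetric InfoNCE loss between paired audio and visual embeddings. It is not the EquiAV objective: the equivariance terms and the attention-based transformation predictor mentioned above are not modeled here. The function names, the temperature of 0.07, and the toy data are all assumptions chosen for illustration.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def audio_visual_info_nce(audio_emb, visual_emb, temperature=0.07):
    """Symmetric InfoNCE: item i in each modality is the positive pair,
    every other item in the batch acts as a negative.
    The temperature value is an assumed default, not taken from the paper."""
    audio = [normalize(a) for a in audio_emb]
    visual = [normalize(v) for v in visual_emb]
    # Cosine similarity between every audio/visual pair, scaled by temperature.
    logits = [
        [sum(x * y for x, y in zip(a, v)) / temperature for v in visual]
        for a in audio
    ]
    transposed = [list(col) for col in zip(*logits)]
    audio_to_visual = cross_entropy_diagonal(logits)
    visual_to_audio = cross_entropy_diagonal(transposed)
    return 0.5 * (audio_to_visual + visual_to_audio)


# Toy data: "paired" visual embeddings are noisy copies of the audio ones.
random.seed(0)
audio = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
paired_video = [[x + 0.01 * random.gauss(0, 1) for x in row] for row in audio]
unrelated_video = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]

paired_loss = audio_visual_info_nce(audio, paired_video)
unrelated_loss = audio_visual_info_nce(audio, unrelated_video)
print(f"paired: {paired_loss:.4f}  unrelated: {unrelated_loss:.4f}")
```

In this toy setup the loss should be lower when the two modalities are aligned than when they are unrelated, which is the signal a contrastive objective of this kind optimizes.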
Abstract
Recent advancements in self-supervised audio-visual representation learning have demonstrated its potential to capture rich and comprehensive representations. However, despite the advantages of data augmentation