BriefGPT.xyz
Jun, 2024
MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization
Adriana Fernandez-Lopez, Honglie Chen, Pingchuan Ma, Lu Yin, Qiao Xiao...
TL;DR
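This study proposes a regularization technique for training visual and audio-visual speech recognition models from scratch. By learning sparse structures, it reduces training time while achieving competitive recognition results.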
Abstract
Pre-trained models have been a foundational approach in speech recognition, albeit with associated additional costs. In this study, we propose a...