BriefGPT.xyz
Nov, 2018
On Training Targets and Objective Functions for Deep-Learning-Based Audio-Visual Speech Enhancement
Daniel Michelsanti, Zheng-Hua Tan, Sigurdur Sigurdsson, Jesper Jensen
TL;DR
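When deep learning is used for audio-visual speech enhancement, the choice of training target and objective function is critical to performance. This study experimentally compares a range of training targets and objective functions; the results show that approaches that directly estimate a mask perform best in terms of estimated speech quality and intelligibility.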
Abstract
Audio-visual speech enhancement (AV-SE) is the task of improving speech quality and intelligibility in a noisy environment using audio and visual information from a talker. Recently, deep learning techniques have