April 2020

Vocoder-Based Speech Synthesis from Silent Videos

Daniel Michelsanti, Olga Slizovskaia, Gloria Haro, Emilia Gómez, Zheng-Hua Tan...

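TL;DR: This paper uses deep learning to estimate speech acoustic features from lip-movement information and synthesise speech from them, improving the quality of speech recovered from silent videos.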

Abstract

Both acoustic and visual information influence human perception of speech. For this reason, the lack of audio in a video sequence determines an extremely low speech intelligibility for untrained lip readers. In this paper, we present a way to synthesise speech from the silent video of a talker using →