Apr, 2024
MM-TTS: 多模态、情绪感应文本转语音综合的统一框架
MM-TTS: A Unified Framework for Multimodal, Prompt-Induced Emotional Text-to-Speech Synthesis
Xiang Li, Zhi-Qi Cheng, Jun-Yan He, Xiaojiang Peng, Alexander G. Hauptmann
TL;DRMultimodal Emotional Text-to-Speech System (MM-TTS) is proposed, which leverages emotional cues from multiple modalities, addresses the limitations of current approaches in capturing human emotions, and achieves superior performance compared to traditional Emotional Text-to-Speech models.