Speak While You Think: Real-Time Streaming Speech Synthesis During Text Generation

September 2023

Speak While You Think: Streaming Speech Synthesis During Text Generation

Avihu Dekel, Slava Shechtman, Raul Fernandez, David Haws, Zvi Kons...

TL;DR: LLM2Speech is an architecture for generating speech from LLM output that reduces the notable latency and enables natural conversation.

Abstract

Large language models (LLMs) demonstrate impressive capabilities, yet interaction with these models is mostly facilitated through text. Using Text-To-Speech to synthesize LLM outputs typically results in notable latency, which is impractical for fluent voice conversations. We propose <