Jingyi Chen, Ju-Seung Byun, Micha Elsner, Andrew Perrault
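TL;DR: Applies reinforcement learning with human feedback to diffusion-model text-to-speech synthesis to generate natural, high-quality speech audio.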
Abstract
Recent advancements in generative models have sparked significant interest
within the machine learning community. In particular, diffusion models have
demonstrated remarkable capabilities in synthesizing images an