T3M: Text Guided 3D Human Motion Synthesis from Speech

Aug, 2024

Wenshuo Peng, Kaipeng Zhang, Sai Qian Zhang

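TL;DR: This work addresses the inaccuracy and lack of flexibility in existing speech-driven 3D motion synthesis methods, which rely solely on speech audio. The proposed T3M method uses text input to give precise control over motion synthesis, markedly improving diversity and user customization. Experiments show that T3M substantially outperforms existing state-of-the-art methods in both quantitative metrics and qualitative evaluations, and it could have a significant impact on virtual reality, gaming, and film production.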

Abstract

Speech-driven 3D motion synthesis seeks to create lifelike animations based on human speech, with potential uses in virtual reality, gaming, and film production. Existing approaches rely solely on speech aud