Human motion generation aims to generate natural human pose sequences and shows immense potential for real-world applications. Substantial progress has been made recently in motion data collection technologies and generation methods, laying the foundation for increasing interest in human motion generation. Most research within this field focuses on generating human motions based on conditional signals, such as text, audio, and scene contexts. While significant advancements have been made in recent years, the task continues to pose challenges due to the intricate nature of human motion and its implicit relationship with conditional signals. In this survey, we present a comprehensive literature review of human motion generation, which, to the best of our knowledge, is the first of its kind in this field. We begin by introducing the background of human motion and generative models, followed by an examination of representative methods for three mainstream sub-tasks: text-conditioned, audio-conditioned, and scene-conditioned human motion generation. Additionally, we provide an overview of common datasets and evaluation metrics. Lastly, we discuss open problems and outline potential future research directions. We hope that this survey could provide the community with a comprehensive glimpse of this rapidly evolving field and inspire novel ideas that address the outstanding challenges.

人体运动生成是生成自然人体姿势序列的目标，具有广泛的实际应用潜力。本文是人体运动生成领域的首篇综述文献，介绍了人体运动和生成模型的背景，并对三个主流子任务（文本条件、音频条件和场景条件的人体运动生成）的代表方法进行了审查。此外，还概述了常见数据集和评估指标，并讨论了开放问题和潜在的未来研究方向。希望该综述能够为社区提供对这个快速发展领域的全面了解，并激发解决尚未解决的挑战的新思路。

人体运动生成调查