In this paper, we address the challenge of generating realistic 3D human motions for action classes that were never seen during training. Our approach decomposes a complex action into simpler movements, specifically those observed during training, by leveraging the knowledge of human motion contained in GPT models. These simpler movements are then combined into a single, realistic animation using the properties of diffusion models. We claim that this decomposition and subsequent recombination of simple movements can synthesize an animation that accurately represents the complex input action. The method operates entirely at inference time and can be integrated with any pre-trained diffusion model, enabling the synthesis of motion classes not present in the training data. We evaluate the method by dividing two benchmark human motion datasets into basic and complex actions, and then compare its performance against state-of-the-art methods.

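This study addresses the challenge of generating realistic 3D human motions for action classes unseen during training. We propose a new method that decomposes complex actions into simple movements and exploits the properties of diffusion models to synthesize realistic animations. Experimental results indicate that the method shows significant potential and practical value for generating motion classes that are absent from the training data.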
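The recombination step is described only at a high level here. The following is a minimal sketch, not the authors' implementation, under these assumptions: each simple movement comes with a frame interval and a set of body joints, a pre-trained diffusion model has already produced one noise prediction per movement, and overlapping predictions are simply averaged. The names `compose_noise`, `intervals`, `joint_masks`, and `fallback` are hypothetical.

```python
import numpy as np


def compose_noise(preds, intervals, joint_masks, fallback):
    """Blend per-movement noise predictions into a single prediction.

    preds:       list of arrays, each of shape (frames, joints, dims)
    intervals:   list of (start, end) frame ranges, end exclusive (assumed)
    joint_masks: list of boolean arrays of shape (joints,) (assumed)
    fallback:    array (frames, joints, dims) used where no movement is active
    """
    total = np.zeros_like(fallback, dtype=float)
    weight = np.zeros(fallback.shape[:2] + (1,), dtype=float)
    for pred, (start, end), joints in zip(preds, intervals, joint_masks):
        # Mask is 1 only on the frames and joints this movement controls.
        mask = np.zeros(fallback.shape[:2] + (1,), dtype=float)
        mask[start:end, joints, :] = 1.0
        total += mask * pred
        weight += mask
    # Average where movements overlap; avoid dividing by zero elsewhere.
    blended = total / np.maximum(weight, 1.0)
    return np.where(weight > 0, blended, fallback)


# Toy example: 4 frames, 3 joints, 2 coordinates per joint.
shape = (4, 3, 2)
pred_a = np.full(shape, 1.0)  # e.g. a hypothetical "raise arm" movement
pred_b = np.full(shape, 3.0)  # e.g. a hypothetical "step forward" movement
out = compose_noise(
    [pred_a, pred_b],
    intervals=[(0, 2), (1, 4)],
    joint_masks=[np.array([True, False, False]), np.array([True, True, False])],
    fallback=np.zeros(shape),
)
```

In this sketch the temporal composition comes from the frame intervals and the spatial composition from the joint masks. In a real sampler the blend would be applied at each denoising step, and the fallback would typically be an unconditional prediction rather than zeros.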

Generating Complex 3D Human Motion through Temporal and Spatial Composition of Diffusion Models