Recent work has shown diffusion models are an effective approach to learning the multimodal distributions arising from demonstration data in behavior cloning. However, a drawback of this approach is the need to learn a denoising function, which is significantly more complex than learning an explicit policy. In this work, we propose Equivariant Diffusion Policy, a novel diffusion policy learning method that leverages domain symmetries to obtain better sample efficiency and generalization in the denoising function. We theoretically analyze the $\mathrm{SO}(2)$ symmetry of full 6-DoF control and characterize when a diffusion model is $\mathrm{SO}(2)$-equivariant. We furthermore evaluate the method empirically on a set of 12 simulation tasks in MimicGen, and show that it obtains a success rate that is, on average, 21.9% higher than the baseline Diffusion Policy. We also evaluate the method on a real-world system to show that effective policies can be learned with relatively few training samples, whereas the baseline Diffusion Policy cannot.
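To make the notion of an $\mathrm{SO}(2)$-equivariant denoising function concrete, the following is a minimal numerical sketch, not the method proposed here. It assumes the usual equivariance condition $\varepsilon(g \cdot o, g \cdot a^k, k) = g \cdot \varepsilon(o, a^k, k)$ for a planar rotation $g$, and it reduces observations and actions to 2D vectors, which is far simpler than the full 6-DoF setting. The names `toy_denoiser`, `broken_denoiser`, and `equivariance_error` are hypothetical and introduced only for illustration.

```python
import math


def rotate(theta, v):
    """Rotate a 2D vector v = (x, y) by angle theta, i.e. act with an element of SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def toy_denoiser(obs, noisy_action, k):
    """Hypothetical noise predictor (an assumption, not the paper's network).

    It uses only the offset from obs to the noisy action and that offset's
    norm, so rotating both inputs rotates the output by the same angle.
    """
    dx, dy = noisy_action[0] - obs[0], noisy_action[1] - obs[1]
    scale = (1.0 + k) / (1.0 + math.hypot(dx, dy))
    return (scale * dx, scale * dy)


def broken_denoiser(obs, noisy_action, k):
    """Counterexample: adding a fixed world-frame direction breaks the symmetry."""
    e = toy_denoiser(obs, noisy_action, k)
    return (e[0] + 1.0, e[1])


def equivariance_error(denoiser, obs, noisy_action, k, theta):
    """Largest componentwise gap between eps(g.o, g.a, k) and g.eps(o, a, k)."""
    lhs = denoiser(rotate(theta, obs), rotate(theta, noisy_action), k)
    rhs = rotate(theta, denoiser(obs, noisy_action, k))
    return max(abs(lhs[0] - rhs[0]), abs(lhs[1] - rhs[1]))


if __name__ == "__main__":
    obs, act, step, theta = (0.3, -0.2), (1.0, 0.5), 3, math.pi / 3
    print("equivariant toy:", equivariance_error(toy_denoiser, obs, act, step, theta))
    print("broken toy:     ", equivariance_error(broken_denoiser, obs, act, step, theta))
```

The first error should be zero up to floating-point rounding, while the second is clearly nonzero. A check of this kind can serve as a quick sanity test for whether a candidate denoiser respects the assumed symmetry.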
