Imitation Learning (IL) aims to discover a policy by minimizing the discrepancy between the agent's behavior and expert demonstrations. However, IL degrades when the demonstrations include noisy ones produced by non-expert behavior, and this is a significant challenge because no supplementary information is available to assess the demonstrators' expertise. In this paper, we introduce Self-Motivated Imitation LEarning (SMILE), a method that progressively filters out demonstrations collected by policies deemed inferior to the current policy, without requiring additional information. We use the forward and reverse processes of Diffusion Models to emulate the shift in demonstration expertise from high to low and from low to high, respectively, thereby extracting the noise information that diffuses expertise. The noise information is then used to predict the number of diffusion steps between the current policy and the demonstrators, which we theoretically show to be equivalent to their expertise gap. We further explain in detail how the predicted diffusion steps are applied to filter out noisy demonstrations in a self-motivated manner, and we provide theoretical grounds for this procedure. Through empirical evaluations on MuJoCo tasks, we demonstrate that our method learns the expert policy from noisy demonstrations and effectively filters out demonstrations whose expertise is inferior to that of the current policy.

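Self-Motivated Imitation LEarning (SMILE) is a method that progressively filters out demonstrations collected by policies deemed inferior to the current policy. It uses the forward and reverse processes of diffusion models to emulate the shift in demonstration expertise from low to high and from high to low, and it uses the noise information to predict the diffusion steps between the current policy and the demonstrators. It further details how the predicted diffusion steps are applied in a self-motivated manner to filter out noisy demonstrations, and provides the theoretical grounds for doing so. Through empirical evaluations on MuJoCo tasks, we demonstrate that our method learns the expert policy from noisy demonstrations and effectively filters out demonstrations that fall below the current policy.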
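The filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `filter_demonstrations`, the `threshold` parameter, and the convention that a larger predicted diffusion step means a demonstrator is further below the current policy in expertise are all assumptions made for illustration. The predicted steps are taken as given inputs, standing in for the output of a learned step predictor that is not shown here.

```python
import numpy as np


def filter_demonstrations(demos, predicted_steps, threshold=0.0):
    """Keep demonstrations whose predicted diffusion step is at most `threshold`.

    Assumption (not from the source): `predicted_steps[i]` estimates how many
    forward-diffusion steps separate the current policy from the demonstrator
    of `demos[i]`, so a larger value means lower expertise than the current
    policy.
    """
    steps = np.asarray(predicted_steps, dtype=float)
    if len(demos) != steps.shape[0]:
        raise ValueError("demos and predicted_steps must have the same length")

    # Boolean mask: True for demonstrations that are retained.
    keep = steps <= threshold
    kept = [demo for demo, flag in zip(demos, keep) if flag]
    return kept, keep


# Hypothetical usage with placeholder trajectory labels.
demos = ["traj_a", "traj_b", "traj_c"]
predicted_steps = [0.0, 3.2, 0.4]
kept, mask = filter_demonstrations(demos, predicted_steps, threshold=0.5)
```

In a training loop, such a filter would presumably be re-applied as the policy improves, so that the retained set shrinks toward the demonstrations at or above the current policy's level; how often and with what threshold is left open here.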

Self-Motivated Imitation Learning: Optimization for Noisy Demonstrations