BriefGPT.xyz
May, 2022
Planning with Diffusion for Flexible Behavior Synthesis
Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine
TL;DR
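This paper rethinks dynamics modeling: it uses a diffusion probabilistic model to remove the bottleneck of classical trajectory optimization, so that sampling from the model and planning with it become nearly identical. Classifier-guided sampling and image inpainting are reinterpreted as planning strategies, and the framework is applied successfully in control settings that emphasize long-horizon decision-making and test-time flexibility.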
Abstract
Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers.
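
To make the conventional pipeline described above concrete, the following is a minimal sketch of "learn a dynamics model, then hand it to a classical trajectory optimizer". It is not the paper's method. The linear model, the random-shooting planner, the function names `fit_linear_dynamics` and `random_shooting`, and the toy 1-D system are all assumptions chosen for illustration.

```python
import numpy as np

def fit_linear_dynamics(states, actions, next_states):
    """Learning step (assumed linear model): least-squares fit of s' ~ [s, a] @ W."""
    inputs = np.hstack([states, actions])
    W, *_ = np.linalg.lstsq(inputs, next_states, rcond=None)
    return W

def random_shooting(W, state, goal, horizon=10, n_candidates=512, rng=None):
    """Classical planning step: sample action sequences, roll each one out
    through the learned model, and keep the one ending closest to the goal."""
    rng = np.random.default_rng(0) if rng is None else rng
    action_dim = W.shape[0] - state.shape[0]
    plans = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    s = np.tile(state, (n_candidates, 1))
    for t in range(horizon):
        # One model step for every candidate plan at once.
        s = np.hstack([s, plans[:, t]]) @ W
    costs = np.linalg.norm(s - goal, axis=1)
    best = np.argmin(costs)
    return plans[best], costs[best]

# Hypothetical toy system: a 1-D point with true dynamics s' = s + 0.1 * a.
rng = np.random.default_rng(0)
S = rng.normal(size=(200, 1))
A = rng.uniform(-1.0, 1.0, size=(200, 1))
S_next = S + 0.1 * A

W = fit_linear_dynamics(S, A, S_next)
plan, cost = random_shooting(W, np.array([0.0]), np.array([0.2]), rng=rng)
print("learned weights:", W.ravel())
print("final distance to goal:", cost)
```

In this sketch, learning is confined to `fit_linear_dynamics`, while all decision-making happens in the hand-written planner. That separation is the one the abstract describes.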