BriefGPT.xyz
Nov, 2024
通过行动进行预测:通过联合去噪过程进行视觉策略学习
Prediction with Action: Visual Policy Learning via Joint Denoising Process
HTML
PDF
Yanjiang Guo, Yucheng Hu, Jianke Zhang, Yen-Jen Wang, Xiaoyu Chen...
TL;DR
本研究解决了传统的图像预测和机器人动作生成之间的差距,提出了PAD框架,将图像预测与机器人行动统一于一个共同的去噪过程中。通过使用扩散变换器,PAD能够同时预测未来图像和机器人动作,显著提高了机器人控制任务的性能,并在真实场景中展现出强大的泛化能力。
Abstract
Diffusion Models
have demonstrated remarkable capabilities in image generation tasks, including image editing and video creation, representing a good understanding of the physical world. On the other line,
Diffusion Mod
→