BriefGPT.xyz
May, 2024
Efficient Imitation Learning with Conservative World Models
Victor Kolev, Rafael Rafailov, Kyle Hatch, Jiajun Wu, Chelsea Finn
TL;DR
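We address policy learning from expert demonstrations without a reward function and propose framing imitation learning as a fine-tuning problem. The approach achieves state-of-the-art performance on the Franka Kitchen environment from high-dimensional raw pixel observations, using only 10 demonstrations and no reward labels, and it also solves complex dexterous manipulation tasks.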
Abstract
We tackle the problem of policy learning from expert demonstrations without a reward function. A central challenge in this space is that these policies fail upon deployment due to issues of distributional shift,