TL;DR该研究提出了使用Domain-agnostic Video Discriminator (DVD)的方法,通过对分类视频完成相同任务的数据进行学习,实现多任务奖励功能的广义推理。通过将人类数据集与机器人数据相结合,该方法可以在未知环境中实现机器人操作任务的成功执行。
Abstract
We are motivated by the goal of generalist robots that can complete a wide range of tasks across many environments. Critical to this is the robot's ability to acquire some metric of task success or reward, which