BriefGPT.xyz
Feb, 2022
Imitation Learning by Estimating Expertise of Demonstrators
Mark Beliaev, Andy Shih, Stefano Ermon, Dorsa Sadigh, Ramtin Pedarsani
TL;DR
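This work uses unsupervised learning of demonstrator expertise to develop a joint model that learns both the demonstrators' policy and their expertise levels. By filtering out each demonstrator's suboptimal behavior, it trains a single policy that can outperform any of the demonstrators and can also be used to estimate a demonstrator's expertise at any state. It achieves better performance than other methods on real robotic control tasks such as Robomimic and on discrete environments such as MiniGrid and chess.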
Abstract
Many existing imitation learning datasets are collected from multiple demonstrators, each with different expertise at different parts of the environment. Yet, standard imitation learning algorithms typically trea…