May, 2024
How Generalizable Is My Behavior Cloning Policy? A Statistical Approach to Trustworthy Performance Evaluation
Joseph A. Vincent, Haruki Nishimura, Masha Itkina, Paarth Shah, Mac Schwager...
TL;DR
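Using a small number of policy rollouts, we propose a framework that provides a rigorous lower bound on robot performance in an arbitrary environment. By applying the standard stochastic ordering, it gives a worst-case bound on the distribution of performance, and it ensures that the bound holds at a user-specified confidence level and tightness.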
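To make the idea of a lower bound that holds at a user-specified confidence level concrete, here is a minimal sketch for the simplest case of binary success/failure rollouts. It uses the one-sided Clopper-Pearson bound, a standard textbook technique chosen here purely for illustration; the summary above does not say which bound the framework uses. The function names `binom_tail` and `success_rate_lower_bound` are hypothetical.

```python
from math import comb


def binom_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def success_rate_lower_bound(successes: int, trials: int, confidence: float = 0.95) -> float:
    """One-sided Clopper-Pearson lower bound on the true success probability.

    Illustrative assumption: rollouts are i.i.d. with a binary outcome.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if successes == 0:
        return 0.0  # no successes observed: the only valid lower bound is 0

    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    # Bisect for the p at which observing >= `successes` has probability alpha.
    # The tail probability is increasing in p, so the search is monotone.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if binom_tail(successes, trials, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo  # the lower end of the bracket keeps the bound conservative


if __name__ == "__main__":
    print(success_rate_lower_bound(20, 20, confidence=0.95))
    print(success_rate_lower_bound(18, 20, confidence=0.95))
```

With 20 successes in 20 rollouts at 95% confidence, the bound is roughly 0.861: even a perfect record over a small number of rollouts certifies a success rate well short of 100%.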
Abstract
With the rise of stochastic generative models in robot policy learning, end-to-end visuomotor policies are increasingly successful at solving complex tasks by learning from human demonstrations. Nevertheless, sin