BriefGPT.xyz
Feb, 2024
SLIM: Skill Learning with Multiple Critics
David Emukpere, Bingbing Wu, Julien Perez
TL;DR
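We propose SLIM, a multi-critic learning approach that gracefully combines multiple reward functions within an actor-critic framework. This markedly improves latent-variable skill discovery for robotic manipulation by overcoming the interference between rewards that can prevent convergence to useful skills. In tabletop manipulation, we show that the approach can acquire safe and efficient motor primitives and that, when these primitives are leveraged through planning, it substantially outperforms existing skill-discovery methods.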
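To make the idea of combining several reward functions through separate critics more concrete, here is a minimal sketch. It is not the authors' implementation: the function names, the one-step advantage (reward minus the critic's value estimate), the per-critic standardization, and the weighted sum are all assumptions chosen for illustration. The point it illustrates is that scoring each reward against its own critic and rescaling before summing keeps a large-magnitude reward from drowning out a small one.

```python
import math


def normalize(values, eps=1e-8):
    """Standardize a list of numbers to zero mean and (roughly) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    return [(v - mean) / (std + eps) for v in values]


def combine_advantages(rewards, values, weights):
    """Combine per-critic advantages into one signal for the actor.

    rewards: dict mapping a reward name to a list of per-sample rewards
    values:  dict mapping the same names to that critic's value estimates
    weights: dict mapping the same names to a scalar weight (hypothetical)
    """
    names = list(rewards)
    n = len(rewards[names[0]])
    combined = [0.0] * n
    for name in names:
        # One-step advantage for this reward, using its own critic as baseline.
        adv = [r - v for r, v in zip(rewards[name], values[name])]
        # Rescale so rewards with very different magnitudes stay comparable.
        adv = normalize(adv)
        for i in range(n):
            combined[i] += weights[name] * adv[i]
    return combined


if __name__ == "__main__":
    # Hypothetical rewards on very different scales.
    rewards = {"skill": [0.1, 0.2, 0.3], "safety": [-100.0, 0.0, -50.0]}
    values = {"skill": [0.0, 0.0, 0.0], "safety": [0.0, 0.0, 0.0]}
    weights = {"skill": 1.0, "safety": 1.0}
    print(combine_advantages(rewards, values, weights))
```

Summing the raw rewards in this example would let the safety term dominate by orders of magnitude; after per-critic standardization both terms contribute on the same scale.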
Abstract
Self-supervised skill learning aims to acquire useful behaviors that leverage the underlying dynamics of the environment. Latent variable models, based on mutual information maximization, have been particularly s…