BriefGPT.xyz
Feb, 2022
Bayesian Nonparametrics for Offline Skill Discovery
Valentin Villecroze, Harry J. Braviner, Panteha Naderian, Chris J. Maddison, Gabriel Loaiza-Ganem
TL;DR
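In this paper, we propose a framework for offline skill learning and explore the previously unexamined connection between Bayesian nonparametrics and offline skill discovery. We introduce a nonparametric method that does not require the number of skills to be specified in advance. Results show that the method can outperform existing offline skill learning algorithms across a variety of environments.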
Abstract
Skills or low-level policies in reinforcement learning are temporally extended actions that can speed up learning and enable complex behaviours. Recent work in offline reinforcement learning and imitation learning