BriefGPT.xyz
Oct, 2020
Provable Hierarchical Imitation Learning via EM
Zhiyu Zhang, Ioannis Paschalidis
TL;DR
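This paper uses a latent variable model to recast hierarchical imitation learning as a parameter inference problem, and gives a theoretical characterization of the EM approach proposed by Daniel et al. (2016). Performance guarantees for the population-level algorithm are studied as an intermediate step, and the algorithm is shown to converge with high probability to a norm ball around the true parameter under certain regularity conditions. To the authors' knowledge, this is the first performance guarantee for a hierarchical imitation learning algorithm that observes only primitive state-action pairs.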
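As a rough illustration of what "parameter inference via EM" means here, the generic EM iteration can be written as below. This is the textbook form, not the paper's exact update: the symbols x, z, and θ are assumed notation, with x standing for the observed state-action pairs and z for whatever the hierarchical policy leaves unobserved (for example, which option is active at each step).

```latex
% Generic EM iteration (notation assumed for illustration, not taken from the paper):
%   x            : observed state-action pairs from the expert
%   z            : unobserved variables of the hierarchical policy
%   \theta       : policy parameter
%   \theta^{(k)} : the k-th iterate
\begin{align*}
\text{E-step:}\quad & Q\bigl(\theta \mid \theta^{(k)}\bigr)
  = \mathbb{E}_{z \sim p(\cdot \mid x;\, \theta^{(k)})}
    \bigl[\log p(x, z;\, \theta)\bigr] \\
\text{M-step:}\quad & \theta^{(k+1)} \in \arg\max_{\theta}\;
  Q\bigl(\theta \mid \theta^{(k)}\bigr)
\end{align*}
```

In this reading, a population-level analysis would study the same iteration with the sample-based Q replaced by its expectation over the data distribution, and the stated result says the finite-sample iterates end up, with high probability, within a norm ball around the true parameter.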
Abstract
Due to recent empirical successes, the options framework for hierarchical reinforcement learning is gaining increasing popularity. Rather than learning from rewards, which suffers from the curse of dimensionality, …