BriefGPT.xyz
June 2023
Skill-Critic: Refining Learned Skills for Reinforcement Learning
Ce Hao, Catherine Weaver, Chen Tang, Kenta Kawamoto, Masayoshi Tomizuka...
TL;DR
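The Skill-Critic algorithm optimizes both the low-level and high-level policies in conjunction with high-level skill selection. A latent space learned from offline demonstration data guides the joint policy optimization, which improves decision-making performance in multiple sparse-reward environments.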
Abstract
Hierarchical reinforcement learning (RL) can accelerate long-horizon decision-making by temporally abstracting a policy into multiple levels. Promising results in sparse reward environments have been seen with