BriefGPT.xyz
Jun, 2019
Sub-policy Adaptation for Hierarchical Reinforcement Learning
Alexander C. Li, Carlos Florensa, Ignasi Clavera, Pieter Abbeel
TL;DR
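This paper proposes HiPPO, a new hierarchical reinforcement learning algorithm that keeps adapting its skills while training on a new task, jointly with the higher level. The algorithm introduces an unbiased latent-dependent baseline for the hierarchical policy gradient, and it proposes a method of training-time abstractions that makes the acquired skills more robust to changes in the environment.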
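As a rough illustration of what conditioning a baseline on the latent can look like, the sketch below subtracts a per-latent mean return from each sampled return. It is a toy under stated assumptions, not the paper's implementation: the function name `latent_dependent_advantages` is hypothetical, the baseline is tabular, and the state dependence of the baseline is omitted entirely.

```python
import numpy as np


def latent_dependent_advantages(latents, returns, num_latents):
    """Return `returns` minus a baseline that depends only on the latent.

    Assumption: the baseline for latent z is the mean return of the samples
    collected under z (a tabular stand-in for a learned b(s, z)).
    """
    latents = np.asarray(latents)
    returns = np.asarray(returns, dtype=float)

    baseline = np.zeros(num_latents)
    for z in range(num_latents):
        mask = latents == z
        if mask.any():  # latents that were never sampled keep a zero baseline
            baseline[z] = returns[mask].mean()

    # Index the per-latent baseline by each sample's latent.
    return returns - baseline[latents]


if __name__ == "__main__":
    latents = [0, 0, 1, 1]
    returns = [1.0, 3.0, 10.0, 14.0]

    per_latent = latent_dependent_advantages(latents, returns, num_latents=2)
    shared = np.asarray(returns) - np.mean(returns)  # latent-independent baseline

    print("latent-dependent advantages:", per_latent)
    print("shared-baseline advantages: ", shared)
```

In this toy data the two latents have very different average returns, so a single shared baseline leaves large advantages that mostly reflect which latent was active, while the per-latent baseline removes that offset.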
Abstract
Hierarchical reinforcement learning is a promising approach to long-horizon decision-making problems with sparse rewards. Unfortunately, most methods still decouple the lower-level skill acquisition process and t…