TL;DR本文提出了一种新的元强化学习算法,称为Meta Goal-generation for Hierarchical RL (MGHRL),通过学习给定过去经验的高层次元策略来生成子目标,而将如何实现子目标留给独立的强化学习子任务来完成,实验结果表明,该算法可以更有效地从过去的经验进行元学习。
Abstract
meta reinforcement learning (meta-RL) is able to accelerate the acquisition of new tasks by learning from past experience. Current meta-RL methods usually learn to adapt to new tasks by directly optimizing the parameters of policies over primitive actions. However, for complex tasks wh