Hierarchical Reinforcement Learning (HRL) has made notable progress in complex control tasks by leveraging temporal abstraction. However, previous HRL algorithms often suffer from serious data inefficiency as environments grow larger. The extended components, i.e., the goal space and the episode length, burden the high-level policy, the low-level policy, or both, because both levels share the total horizon of the episode. In this paper, we present Decoupling Horizons Using a Graph in Hierarchical Reinforcement Learning (DHRL), a method that alleviates this problem by decoupling the horizons of the high-level and low-level policies and bridging the gap between the two horizon lengths with a graph. DHRL provides a freely stretchable high-level action interval, which facilitates longer temporal abstraction and faster training in complex tasks. Our method outperforms state-of-the-art HRL algorithms in typical HRL environments. Moreover, DHRL solves long and complex locomotion and manipulation tasks.

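This paper proposes a hierarchical reinforcement learning method that uses a graph to decouple the horizons of the high-level and low-level policies. The method makes the action interval of the high-level policy more flexible, which enables longer temporal abstraction and faster training. Compared with existing hierarchical reinforcement learning algorithms, it is more data-efficient, and it accomplishes long and complex locomotion and manipulation tasks in typical hierarchical reinforcement learning environments.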

DHRL: A Graph-Based Approach for Long-Horizon and Sparse Hierarchical Reinforcement Learning