General-purpose agents will require large repertoires of skills. Empowerment -- the maximum mutual information between skills and states -- provides a pathway for learning large collections of distinct skills, but mutual information is difficult to optimize. We introduce a new framework, Hierarchical Empowerment, that makes computing empowerment more tractable by integrating concepts from Goal-Conditioned Hierarchical Reinforcement Learning. Our framework makes two specific contributions. First, we introduce a new variational lower bound on mutual information that can be used to compute empowerment over short horizons. Second, we introduce a hierarchical architecture for computing empowerment over exponentially longer time scales. We verify the contributions of the framework in a series of simulated robotics tasks. In a popular ant navigation domain, our four-level agents are able to learn skills that cover a surface area over two orders of magnitude larger than prior work.
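
To make the link between mutual information and distinct skills concrete, the following is a minimal sketch, not the method of this work: it evaluates the mutual information between skills and terminal states in a hypothetical tabular setting with four skills and four states. The function `mutual_information` and the toy matrices `distinct` and `redundant` are illustrative assumptions, and the skill distribution is fixed to uniform rather than maximized over.

```python
import numpy as np

def mutual_information(p_z, p_s_given_z):
    """I(Z; S) in nats for a discrete skill prior p(z) and channel p(s | z)."""
    p_z = np.asarray(p_z, dtype=float)
    p_s_given_z = np.asarray(p_s_given_z, dtype=float)
    joint = p_z[:, None] * p_s_given_z      # p(z, s)
    p_s = joint.sum(axis=0)                 # marginal p(s)
    indep = p_z[:, None] * p_s[None, :]     # p(z) p(s)
    mask = joint > 0                        # skip zero-probability cells
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))

# Hypothetical toy setup (an assumption for illustration only):
# 4 skills, 4 terminal states, uniform distribution over skills.
uniform = np.full(4, 0.25)
distinct = np.eye(4)                              # each skill reaches its own state
redundant = np.tile([1.0, 0.0, 0.0, 0.0], (4, 1))  # every skill reaches state 0

mi_distinct = mutual_information(uniform, distinct)
mi_redundant = mutual_information(uniform, redundant)
```

In this toy case the distinct skills attain log 4 nats, which is the largest value possible with four skills, while the redundant skills yield zero. Empowerment itself is the maximum of this quantity over skill distributions, which is the part that becomes hard to optimize in large or continuous state spaces.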

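General-purpose agents need large repertoires of skills. We introduce a new framework, Hierarchical Empowerment, that integrates concepts from Goal-Conditioned Hierarchical Reinforcement Learning into the computation of empowerment, using a new variational lower bound and a hierarchical architecture. We verify the framework's contributions in simulated robotics tasks, where our four-level agents learn skills that cover a surface area more than two orders of magnitude larger than prior work.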

Hierarchical Empowerment: Towards Tractable Empowerment-Based Skill Learning