Intelligent creatures can explore their environments and learn useful skills without supervision. In this paper, we propose DIAYN ("Diversity is All You Need"), a method for learning useful skills without a reward function. Our proposed method learns skills by maximizing an information-theoretic objective using a maximum entropy policy. On a variety of simulated robotic tasks, we show that this simple objective results in the unsupervised emergence of diverse skills, such as walking and jumping. In a number of reinforcement learning benchmark environments, our method is able to learn a skill that solves the benchmark task despite never receiving the true task reward. In these environments, some of the learned skills correspond to solving the task, and each skill that solves the task does so in a distinct manner. Our results suggest that unsupervised discovery of skills can serve as an effective pretraining mechanism for overcoming challenges of exploration and data efficiency in reinforcement learning.

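This paper proposes DIAYN ("Diversity is All You Need"), a method that learns useful skills without a reward function. It learns skills by maximizing an information-theoretic objective, performs well on a range of simulated robotic tasks, and can help address other reinforcement learning challenges such as exploration and data efficiency.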
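
The text above does not spell out the information-theoretic objective. As a rough illustration only, the sketch below assumes the pseudo-reward form usually associated with DIAYN, log q(z | s) - log p(z), where q is a learned skill discriminator and p(z) is a uniform prior over skills. The function names, the use of raw discriminator logits, and the uniform prior are assumptions made for this example and are not taken from the text.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))


def diversity_reward(discriminator_logits, skill, num_skills):
    """Hypothetical pseudo-reward: log q(z | s) - log p(z).

    discriminator_logits: unnormalized scores over skills for one state
        (assumed to come from some learned discriminator).
    skill: index of the skill z the policy is currently executing.
    num_skills: number of skills; p(z) is assumed uniform over them.
    """
    log_q = log_softmax(np.asarray(discriminator_logits, dtype=float))[skill]
    log_p = -np.log(num_skills)  # log of a uniform prior
    return float(log_q - log_p)


# A state that the discriminator attributes strongly to skill 0.
logits = [10.0, 0.0, 0.0, 0.0]
reward_matching = diversity_reward(logits, skill=0, num_skills=4)
reward_other = diversity_reward(logits, skill=1, num_skills=4)
```

Under these assumptions, a skill is rewarded when it reaches states from which it is easy to tell which skill is running (`reward_matching` is positive) and penalized when its states look like those of another skill (`reward_other` is negative). If the discriminator is no better than chance, the reward is zero.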

Diversity Is All You Need: Learning Skills without a Reward Function