Reinforcement learning (RL) agents make decisions using nothing but observations from the environment, and consequently rely heavily on the representations of those observations. Though some recent breakthroughs have used vector-based categorical representations of observations, often referred to as discrete representations, there is little work explicitly assessing the significance of such a choice. In this work, we provide a thorough empirical investigation of the advantages of representing observations as vectors of categorical values within the context of reinforcement learning. We perform evaluations on world-model learning, model-free RL, and, ultimately, continual RL problems, where the benefits best align with the needs of the problem setting. We find that, when compared to traditional continuous representations, world models learned over discrete representations accurately model more of the world with less capacity, and that agents trained with discrete representations learn better policies with less data. In the context of continual RL, these benefits translate into faster-adapting agents. Additionally, our analysis suggests that the observed performance improvements can be attributed to the information contained within the latent vectors and potentially the encoding of the discrete representation itself.
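
To make the phrase "vectors of categorical values" concrete, the following is a minimal sketch of the difference between a discrete and a continuous representation of the same encoder output. It is illustrative only: the abstract does not specify how the categorical values are produced, so the argmax-over-logits scheme, the tanh squashing, and the names `to_categorical`, `one_hot`, and `logits` are all assumptions made for this example.

```python
import math

def to_categorical(logits_per_variable):
    """Pick one class index per latent variable (argmax over its logits).

    Assumption: each latent variable is described by a list of logits.
    """
    return [max(range(len(g)), key=g.__getitem__) for g in logits_per_variable]

def one_hot(indices, num_classes):
    """Encode each class index as a one-hot vector of length num_classes."""
    return [[1.0 if c == i else 0.0 for c in range(num_classes)] for i in indices]

# Hypothetical encoder output: 3 latent variables, 4 classes each.
logits = [
    [0.1, 2.0, -1.0, 0.3],
    [1.5, 0.2, 0.2, -0.7],
    [-0.4, 0.0, 0.9, 3.1],
]

# Discrete representation: a vector of categorical values (one index per variable),
# here also expanded into one-hot form.
codes = to_categorical(logits)
discrete = one_hot(codes, num_classes=4)

# Continuous representation of the same output: real values squashed into (-1, 1).
continuous = [math.tanh(x) for g in logits for x in g]

print(codes)       # [1, 0, 3]
print(discrete[0]) # [0.0, 1.0, 0.0, 0.0]
```

In this sketch both representations occupy twelve numbers, but the discrete one can take only 4^3 = 64 distinct values, whereas the continuous one varies smoothly with the encoder output.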

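Through a thorough empirical investigation of discrete representations, we find that, compared with traditional continuous representations, representing observations as vectors of categorical values models the world more accurately across world-model learning, model-free RL, and continual RL problems, and that agents trained with discrete representations learn better policies with less data and adapt faster in continual RL. In addition, our analysis suggests that the performance improvements may be attributable to the information contained in the latent vectors and to the encoding of the discrete representation itself.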

Harnessing Discrete Representations for Continual Reinforcement Learning