Continual learning with deep neural networks presents challenges distinct
from both the fixed-dataset and convex continual learning regimes. One such
challenge is plasticity loss, wherein a neural network trained in an online
fashion displays a degraded ability to fit new tasks. This problem has been
extensively studied in both supervised learning and off-policy reinforcement
learning (RL), where a number of remedies have been proposed. Still, plasticity
loss has received less attention in the on-policy deep RL setting. Here we
perform an extensive set of experiments examining plasticity loss and a variety
of mitigation methods in on-policy deep RL. We demonstrate that plasticity loss
is pervasive under domain shift in this regime, and that a number of methods
developed to resolve it in other settings fail, sometimes even yielding
performance worse than applying no intervention at all. In contrast,
we find that a class of ``regenerative'' methods consistently
mitigates plasticity loss in a variety of contexts, including gridworld tasks
and more challenging environments like Montezuma's Revenge and ProcGen.

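Summary: Continual learning with deep neural networks presents challenges distinct from both the fixed-dataset and convex continual learning regimes. One such challenge is plasticity loss, wherein a neural network trained online shows a degraded ability to fit new tasks. Through a series of experiments, this paper studies plasticity loss and a variety of mitigation methods in on-policy deep reinforcement learning. It finds that plasticity loss is pervasive under domain shift and that many existing remedies fail in this setting, whereas a class of ``regenerative'' methods consistently mitigates plasticity loss across a variety of environments, including gridworld tasks and more challenging environments such as Montezuma's Revenge and ProcGen.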