Data efficiency of learning, which plays a key role in the Reinforcement Learning (RL) training process, becomes even more important in continual RL with sequential environments. In continual RL, the learner interacts with non-stationary, sequential tasks and is required to learn new tasks without forgetting previous knowledge. However, there is little work on implementing data augmentation for continual RL. In this paper, we investigate the efficacy of data augmentation for continual RL. Specifically, we provide benchmarking data augmentations for continual RL, by (1) summarising existing data augmentation methods and (2) including a new augmentation method for continual RL: Adversarial Augmentation with Gradient Episodic Memory (Adv-GEM). Extensive experiments show that data augmentations, such as random amplitude scaling, state-switch, mixup, adversarial augmentation, and Adv-GEM, can improve existing continual RL algorithms in terms of their average performance, catastrophic forgetting, and forward transfer, on robot control tasks. All data augmentation methods are implemented as plug-in modules for trivial integration into continual RL methods.

本文解决了在连续强化学习中数据效率不足的问题，探索了数据增强技术的有效性。通过总结现有的数据增强方法并提出一种新的对抗增强方法Adv-GEM，实验证明这些数据增强技术能够显著提高传统连续强化学习算法的表现，减少灾难性遗忘，并促进知识的前向转移。

通过对抗梯度情景记忆进行连续强化学习的数据增强