Catastrophic interference is common in many network-based learning systems,
and many proposals exist for mitigating it. Before overcoming interference, we
must understand it better. In this work, we provide a definition and a novel
measure of interference for value-based reinforcement learning methods such as
Fitted Q-Iteration and DQN. We systematically evaluate our measure of
interference, showing that it correlates with instability in control
performance, across a variety of network architectures. Our new interference
measure allows us to ask novel scientific questions about commonly used deep
learning architectures and study learning algorithms which mitigate
interference. Lastly, we outline a class of algorithms, which we call
online-aware, that are designed to mitigate interference. We show that they
reduce interference according to our measure and improve stability and
performance in several classic control environments.

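This paper studies catastrophic interference, a phenomenon widespread in value-based reinforcement learning methods. It provides a new measure of interference, systematically evaluates how this measure correlates with instability in control performance across a variety of network architectures, and proposes a class of algorithms called "online-aware" to reduce interference, showing that they can improve stability and performance in several classic control environments.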