Abstract

The backprop algorithm for learning in neural networks utilizes two mechanisms: first, stochastic gradient descent, and second, initialization with small random weights, where the latter is essential to the effectiveness of the former. We show that in
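The following is a minimal sketch (not from the paper) of the two mechanisms the abstract names: small random weight initialization followed by stochastic gradient descent via backprop. The network size, initialization scale, learning rate, and toy data are all assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mechanism 2: initialization with small random weights
# (std of 0.1 is an assumed choice, not taken from the paper).
W1 = rng.normal(0.0, 0.1, size=(8, 2))   # hidden layer weights
W2 = rng.normal(0.0, 0.1, size=(1, 8))   # output layer weights

def forward(x):
    h = np.tanh(W1 @ x)   # hidden activations
    y = W2 @ h            # linear output
    return h, y

# Mechanism 1: stochastic gradient descent, updating on one
# randomly drawn example at a time.
lr = 0.01
X = rng.normal(size=(100, 2))              # toy inputs (assumed)
T = (X[:, :1] * X[:, 1:]).reshape(100, 1)  # toy targets: x1 * x2

for step in range(1000):
    i = rng.integers(len(X))               # stochastic sample
    x, t = X[i], T[i]
    h, y = forward(x)
    e = y - t                              # prediction error
    # Backprop: gradients of 0.5 * e^2 w.r.t. each weight matrix.
    gW2 = np.outer(e, h)
    gW1 = np.outer((W2.T @ e) * (1 - h**2), x)
    W2 -= lr * gW2
    W1 -= lr * gW1
```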