BriefGPT.xyz
Mar, 2024
On the Diminishing Returns of Width for Continual Learning
Etash Guha, Vihan Lakshman
TL;DR
Deep neural networks demonstrate state-of-the-art performance across a variety of settings, but they often suffer from "catastrophic forgetting" when trained on new tasks in sequence. This work develops a framework for analyzing continual learning theoretically and establishes a direct relationship between network width and forgetting. Specifically, we prove that increasing network width to reduce forgetting yields diminishing returns, and we empirically confirm the predictions of our theory over a range of widths not explored in prior studies, clearly observing this diminishing-returns effect.
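The measurement the TL;DR describes can be illustrated with a small experiment: train a network sequentially on two tasks, record forgetting as the accuracy drop on the first task after learning the second, and repeat across hidden widths. The sketch below is not the authors' code; the synthetic tasks, the two-layer MLP, and all hyperparameters are illustrative assumptions chosen only to show how such a width-versus-forgetting comparison can be set up.

```python
# Minimal sketch (assumed setup, not the paper's experiments): measure forgetting
# as a function of hidden-layer width under plain sequential training on two tasks.
import torch
import torch.nn as nn


def make_task(seed, n=2000, d=32, classes=4):
    """Generate a toy classification task with its own random class centers."""
    g = torch.Generator().manual_seed(seed)
    centers = torch.randn(classes, d, generator=g) * 3.0
    y = torch.randint(0, classes, (n,), generator=g)
    x = centers[y] + torch.randn(n, d, generator=g)
    return x, y


def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()


def train(model, x, y, epochs=200, lr=1e-2):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()


def forgetting_for_width(width, d=32, classes=4):
    """Train on task A, then task B; return the accuracy drop on task A."""
    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, classes))
    xa, ya = make_task(seed=1, d=d, classes=classes)
    xb, yb = make_task(seed=2, d=d, classes=classes)
    train(model, xa, ya)
    acc_before = accuracy(model, xa, ya)
    train(model, xb, yb)  # plain sequential training: no replay or regularization
    acc_after = accuracy(model, xa, ya)
    return acc_before - acc_after


if __name__ == "__main__":
    for w in [8, 32, 128, 512, 2048]:
        print(f"width={w:5d}  forgetting={forgetting_for_width(w):.3f}")
```

In a setup like this, one would plot forgetting against width: the paper's claim is that forgetting decreases as width grows, but each further increase in width buys a smaller reduction, which is the diminishing-returns effect it formalizes.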
Abstract
While deep neural networks have demonstrated groundbreaking performance in various settings, these models often suffer from \emph{catastrophic forgetting} when trained on new tasks in sequence. Several works have …