BriefGPT.xyz
Oct, 2021
Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning
Danruo Deng, Guangyong Chen, Jianye Hao, Qiong Wang, Pheng-Ann Heng
TL;DR
Investigates the relationship between the weight loss landscape and stability, and on that basis proposes FS-DGPM, which uses soft weights to represent the importance of past tasks and flattens the sharpness of the weight loss landscape, improving the model's sensitivity to learning new skills and its generalization ability.
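As the TL;DR describes, FS-DGPM combines a sharpness-flattening weight perturbation with a gradient projection memory whose past-task directions carry soft importance weights. The PyTorch sketch below illustrates one plausible reading of that combination; it is not the authors' implementation. The function name `flatten_sharpness_step`, the per-parameter `memory_bases`, and the scalar `soft_weights` are all illustrative assumptions, and each basis is assumed to have orthonormal rows.

```python
import torch


def flatten_sharpness_step(model, loss_fn, batch, memory_bases, soft_weights,
                           lr=0.1, rho=0.05):
    """One illustrative FS-DGPM-style update (hypothetical simplification).

    1. Perturb the weights toward higher loss (sharpness-aware ascent step).
    2. Recompute the gradient at the perturbed point.
    3. Softly project that gradient away from stored past-task directions,
       scaled by learnable importance weights, before applying SGD.
    """
    x, y = batch

    # --- Step 1: ascend to a nearby high-loss point (flattening sharpness) ---
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads))
    eps = [rho * g / (grad_norm + 1e-12) for g in grads]
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.add_(e)

    # --- Step 2: gradient at the perturbed weights ---
    loss_adv = loss_fn(model(x), y)
    grads_adv = torch.autograd.grad(loss_adv, list(model.parameters()))

    # --- Step 3: soft projection onto the complement of the memory bases ---
    with torch.no_grad():
        for p, e, g, basis, lam in zip(model.parameters(), eps, grads_adv,
                                       memory_bases, soft_weights):
            p.sub_(e)  # restore the original weights
            if basis is not None:
                flat = g.reshape(-1)
                # Remove components along past-task directions, scaled by
                # sigmoid(lam) in [0, 1]: 1 = fully protect the old task,
                # 0 = plain SGD. basis rows are assumed orthonormal.
                coeff = torch.sigmoid(lam) * (basis @ flat)
                flat = flat - basis.t() @ coeff
                g = flat.reshape_as(g)
            p.sub_(lr * g)


# Example usage on a toy model (all names and shapes are illustrative):
model = torch.nn.Linear(4, 2)
batch = (torch.randn(8, 4), torch.randint(0, 2, (8,)))
basis_w = torch.randn(1, 8)
basis_w = basis_w / basis_w.norm()   # single orthonormal row for the weight
bases = [basis_w, None]              # weight basis, no stored bias basis
lams = [torch.zeros(()), torch.zeros(())]
flatten_sharpness_step(model, torch.nn.functional.cross_entropy, batch,
                       bases, lams)
```

Squashing each soft weight through a sigmoid keeps the projection strength in [0, 1], so the update can interpolate between full stability (removing all past-task directions) and full plasticity (an unconstrained gradient step).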
Abstract
The backpropagation networks are notably susceptible to catastrophic forgetting, where networks tend to forget previously learned skills upon learning new ones. To address this 'sensitivity-stability' dilemma …