May 2023
Recyclable Tuning for Continual Pre-training
Yujia Qin, Cheng Qian, Xu Han, Yankai Lin, Huadong Wang...
TL;DR
This paper studies how, under continual pre-training where the model keeps being upgraded, the outdated tuned weights of an earlier model version can be recycled for the upgraded pre-trained model. It proposes two methods for this problem, initialization and distillation, which improve convergence speed and performance.
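As a rough illustration of the two recycling strategies named in the TL;DR, the sketch below shows (a) initialization-based recycling, warm-starting the upgraded PLM from whichever outdated adapted weights still match by name and shape, and (b) distillation-based recycling, using the outdated tuned model as a teacher. This is a minimal sketch under assumed interfaces, not the authors' implementation; the helper names, the shape-matching rule, and the hyperparameters `alpha` and `T` are illustrative choices.

```python
import torch
import torch.nn.functional as F


def recycle_by_initialization(upgraded_plm: torch.nn.Module,
                              old_tuned_state: dict) -> torch.nn.Module:
    """Warm-start the upgraded PLM from outdated adapted weights.

    Only parameters whose names and shapes still match the upgraded
    checkpoint are copied; everything else keeps the upgraded PLM's values.
    """
    new_state = upgraded_plm.state_dict()
    reusable = {
        name: weight
        for name, weight in old_tuned_state.items()
        if name in new_state and new_state[name].shape == weight.shape
    }
    new_state.update(reusable)
    upgraded_plm.load_state_dict(new_state)
    return upgraded_plm


def recycle_by_distillation_loss(student_logits: torch.Tensor,
                                 teacher_logits: torch.Tensor,
                                 labels: torch.Tensor,
                                 alpha: float = 0.5,
                                 T: float = 2.0) -> torch.Tensor:
    """Combine the task loss with a KL term that transfers knowledge from
    the outdated tuned model (teacher) to the upgraded model (student)."""
    task_loss = F.cross_entropy(student_logits, labels)
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return (1 - alpha) * task_loss + alpha * kd_loss
```

In this framing, initialization reuses the old adapted parameters directly where they remain compatible, while distillation transfers the task behavior of the outdated tuned model through its output distribution, which is what the summary credits for faster convergence and better performance.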
Abstract
Continual pre-training is the paradigm where pre-trained language models (PLMs) continually acquire fresh knowledge from growing data and gradually get upgraded. Before an upgraded PLM is released, we may have tuned the original PLM for various tasks and stored the adapted weights. …