Jul, 2024
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging
Anton Alexandrov, Veselin Raychev, Mark Niklas Müller, Ce Zhang, Martin Vechev...
TL;DR
Proposes Branch-and-Merge (BaM), a new adaptation method based on iteratively merging multiple models, each fine-tuned on a subset of the available training data. BaM significantly reduces forgetting in the source domain while preserving learning in the target domain, yielding lower forgetting and improved target-domain performance.
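The BaM idea summarized above can be sketched in a few lines: instead of one long fine-tuning run, each iteration trains several "branches" on disjoint data subsets and merges them by weight averaging, and the merged model seeds the next iteration. This is a minimal illustrative sketch, not the paper's implementation: model weights are plain dicts of floats, and the `fine_tune` function is a hypothetical stand-in for actual training.

```python
def fine_tune(weights, data_subset):
    """Hypothetical stand-in for fine-tuning: nudge each weight
    toward the mean of the subset (purely illustrative)."""
    target = sum(data_subset) / len(data_subset)
    return {k: w + 0.1 * (target - w) for k, w in weights.items()}

def merge(models):
    """Merge branch models by elementwise weight averaging."""
    n = len(models)
    return {k: sum(m[k] for m in models) / n for k in models[0]}

def branch_and_merge(base, data, n_branches=2, n_iterations=3):
    """Each iteration: fine-tune n_branches copies of the current
    model on disjoint data subsets, then merge them."""
    model = dict(base)
    chunk = len(data) // n_branches
    for _ in range(n_iterations):
        branches = [
            fine_tune(model, data[i * chunk:(i + 1) * chunk])
            for i in range(n_branches)
        ]
        model = merge(branches)  # merged model seeds the next round
    return model
```

Merging after each short branch run is what limits drift from the source-domain weights: every merge pulls the branches back toward a common average, so no single run wanders far enough to cause catastrophic forgetting.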
Abstract
As open-weight large language models (LLMs) achieve ever more impressive performances across a wide range of tasks in English, practitioners aim to adapt these models to different languages. However, such language adaptation…