A Study of Forgetting in Continual Pre-training of Aligned Large Language Models

Jan, 2024

Examining Forgetting in Continual Pre-training of Aligned Large Language Models

Chen-An Li, Hung-Yi Lee

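TL;DR: This work examines how catastrophic forgetting during continual pre-training affects large language models that have already been fine-tuned, and highlights the challenge posed by the repetition problem.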

Abstract

Recent advances in large language models (LLMs) have exhibited remarkable proficiency across various tasks. Given the potent applications of LLMs in numerous fields, there has been a surge in LLM development. In developing LLMs, a common practice involves →