March 2025
Training Plug-n-Play Knowledge Modules with Deep Context Distillation
Lucas Caccia, Alan Ansell, Edoardo Ponti, Ivan Vulić, Alessandro Sordoni
TL;DR
This paper addresses the challenge of dynamically integrating new information into Large Language Models after pre-training in low-data scenarios, proposing to modularize knowledge by training document-level Knowledge Modules (KMs). Our Deep Context Distillation method surpasses traditional next-token prediction and pre-training techniques, achieving more effective knowledge storage and integration.
Abstract
Dynamically integrating new or rapidly evolving information after (Large) Language Model pre-training remains challenging, particularly in low-data scenarios or when dealing with private and specialized documents.
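To make the training signal concrete, below is a minimal sketch of a Deep Context Distillation step, assuming Hugging Face-style causal language models: a teacher that sees the document in its context supervises a student whose only access to the document is a plug-in knowledge module (e.g., a LoRA adapter). All names (`teacher_model`, `student_model`, `doc_ids`, `query_ids`) are illustrative, and the loss below is a sketch of the idea, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def deep_context_distillation_loss(teacher_model, student_model,
                                   doc_ids, query_ids):
    """One DCD step (illustrative sketch).

    The teacher conditions on [document; query]; the student, carrying
    the knowledge module, conditions on the query alone and is trained
    to reproduce the teacher's output distribution and hidden states.
    """
    # Teacher forward pass with the document in context (frozen).
    with torch.no_grad():
        t_out = teacher_model(
            input_ids=torch.cat([doc_ids, query_ids], dim=1),
            output_hidden_states=True,
        )

    # Student forward pass without the document; the knowledge
    # module must supply the missing information.
    s_out = student_model(input_ids=query_ids, output_hidden_states=True)

    q_len = query_ids.size(1)

    # KL divergence between teacher and student next-token
    # distributions, restricted to the query positions.
    kl_loss = F.kl_div(
        F.log_softmax(s_out.logits, dim=-1),
        F.softmax(t_out.logits[:, -q_len:], dim=-1),
        reduction="batchmean",
    )

    # L1 matching of per-layer hidden states on the query positions,
    # a denser signal than output logits alone.
    hidden_loss = sum(
        (s_h - t_h[:, -q_len:]).abs().mean()
        for s_h, t_h in zip(s_out.hidden_states, t_out.hidden_states)
    )
    return kl_loss + hidden_loss
```

Intuitively, matching intermediate hidden states in addition to output probabilities gives the knowledge module many more constraints per query than next-token prediction on the document alone, which is the motivation for distilling "deeply" rather than only at the output layer.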