BriefGPT.xyz
Nov, 2022
Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic
Mandar Sharma, Nikhil Muralidhar, Naren Ramakrishnan
TL;DR
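Proposes a novel framework of information-theoretic interventions to overcome the catastrophic forgetting of linguistic skills that occurs when non-linguistic skills are injected into language models, so that the models gain mathematical reasoning ability while retaining their linguistic proficiency.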
Abstract
Through their transfer learning abilities, highly-parameterized large pre-trained language models have dominated the NLP landscape for a multitude of downstream language tasks. Though linguistically proficient, t