BriefGPT.xyz
Mar, 2022
记忆 Transformer
Memorizing Transformers
HTML
PDF
Yuhuai Wu, Markus N. Rabe, DeLesley Hutchins, Christian Szegedy
TL;DR
本文介绍了一种使用内部储存器实现直接读取并记忆新数据的语言模型,在多个基准测试和任务中展示了近似kNN查找技术,着重测试了代码和数学等领域,并证明了随着储存器大小的增加,性能将稳步提高。
Abstract
language models
typically need to be trained or finetuned in order to acquire
new knowledge
, which involves updating their weights. We instead envision
→