BriefGPT.xyz
Oct, 2022
Characterizing Verbatim Short-Term Memory in Neural Language Models
Kristijan Armeni, Christopher Honey, Tal Linzen
TL;DR
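This study examines whether language models can retrieve the exact words that appeared earlier in a text. It finds that transformer models can recover both the identity and the order of the nouns from the first occurrence of a list, whereas the LSTM model relies more on the semantic gist of the preceding words and on how they relate to the other words in the list.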
Abstract
When a language model is trained to predict natural language sequences, its prediction at each moment depends on a representation of prior context. What kind of information about the prior context can