BriefGPT.xyz
Oct, 2020
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
Nikunj Saunshi, Sadhika Malladi, Sanjeev Arora
TL;DR
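This paper gives a mathematical study of how autoregressive language model pretraining helps on downstream tasks. It hypothesizes that classification tasks can be recast as sentence completion tasks, shows under this hypothesis that language modeling can be a meaningful pretraining task, and provides the corresponding mathematical formalization. Its analysis further suggests that features from a language model can help solve classification tasks with a linear classifier.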
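As a rough illustration of the sentence completion idea, the sketch below classifies a review by applying a linear function to next-word probabilities after a prompt such as "This movie was". It is only a toy under stated assumptions: the vocabulary, the weights, the probability vectors, and the function name `classify_by_completion` are all hypothetical and are not taken from the paper.

```python
import numpy as np

# Hypothetical toy vocabulary (an assumption for illustration only).
VOCAB = ["great", "good", "terrible", "boring", "the"]

# Linear classifier over the vocabulary: +1 for positive indicator words,
# -1 for negative indicator words, 0 for uninformative words.
WEIGHTS = np.array([1.0, 1.0, -1.0, -1.0, 0.0])


def classify_by_completion(next_word_probs, weights):
    """Return +1 or -1 from a linear function of next-word probabilities."""
    score = float(np.dot(weights, next_word_probs))
    return 1 if score >= 0 else -1


# Made-up next-word distributions p(word | "<review> This movie was"),
# standing in for the output of a real language model.
p_positive_review = np.array([0.40, 0.30, 0.05, 0.05, 0.20])
p_negative_review = np.array([0.05, 0.05, 0.45, 0.25, 0.20])

print(classify_by_completion(p_positive_review, WEIGHTS))
print(classify_by_completion(p_negative_review, WEIGHTS))
```

The point of the design is that the classifier never looks at the review text directly: it only reads the model's predicted distribution over the next word, so a language model that predicts that distribution well already carries the information a linear classifier needs.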
Abstract
Autoregressive language models pretrained on large corpora have been successful at solving downstream tasks, even with zero-shot usage. However, there is little theoretical justification for their success. This p