BriefGPT.xyz
Sep, 2020
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
Timo Schick, Hinrich Schütze
TL;DR
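This work converts text inputs into cloze questions that contain a task description, and combines this with gradient-based optimization and the use of unlabeled data. The resulting small language models reach performance similar to GPT-3, and the study identifies the key factors that small language models need in order to be applied successfully.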
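The cloze reformulation mentioned in the TL;DR can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the sentiment task, the pattern text, the `[MASK]` token, the label words, and the function names `to_cloze` and `predict_label` do not come from this page. The gradient-based optimization and the use of unlabeled data are not shown.

```python
from typing import Dict

# Hypothetical mask token; the real token depends on the language model used.
MASK = "[MASK]"

# Hypothetical mapping from each label to one word a language model
# could predict at the mask position.
LABEL_WORDS: Dict[str, str] = {"positive": "great", "negative": "terrible"}


def to_cloze(text: str) -> str:
    """Wrap a raw input in a pattern that also carries a task description."""
    return f"{text} All in all, it was {MASK}."


def predict_label(mask_word_scores: Dict[str, float]) -> str:
    """Return the label whose word scores highest at the mask position.

    `mask_word_scores` stands in for the scores a masked language model
    would assign to candidate words; words that are missing count as -inf.
    """
    return max(
        LABEL_WORDS,
        key=lambda label: mask_word_scores.get(LABEL_WORDS[label], float("-inf")),
    )


if __name__ == "__main__":
    print(to_cloze("Best pizza ever!"))
    print(predict_label({"great": 0.8, "terrible": 0.1}))
```

In this sketch the classification task is reduced to asking which label word fits the blank best, so a pretrained masked language model could be queried without a task-specific output layer.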
Abstract
When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance.