Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan...
TL;DR: Fine-tuning on top of large-scale pre-trained language models can significantly improve baseline performance on NLP tasks. The paper further shows that scaling up language models greatly improves task-agnostic few-shot learning performance, and it examines the strengths and limitations of the GPT-3 model.
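The task-agnostic few-shot setting mentioned above amounts to in-context learning: task demonstrations are concatenated into the prompt and the model completes the final query, with no gradient updates. A minimal sketch of such prompt construction is shown below; the helper function and the translation task are illustrative assumptions, not code or examples from the paper.

```python
# Minimal sketch of few-shot (in-context) prompting: the task is conveyed
# entirely through demonstrations in the prompt; model weights stay frozen.

def build_few_shot_prompt(examples, query, task_description=""):
    """Assemble a few-shot prompt from (input, output) demonstration pairs."""
    parts = [task_description] if task_description else []
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}")
    # The model is expected to continue the text after the final "Output:".
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

demos = [("cheese", "fromage"), ("dog", "chien")]
prompt = build_few_shot_prompt(demos, "cat", "Translate English to French.")
print(prompt)
```

Zero-shot and one-shot are the same idea with zero or one demonstration pair; the paper's central claim is that larger models extract more from such in-context demonstrations.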
Abstract
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture,