May, 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen...
TL;DR
We present Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which we aim to fully and responsibly share with interested researchers. We show that OPT-175B is comparable to GPT-3, while requiring only 1/7th the carbon footprint to develop.
Abstract
Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning. Given their computational cost, these models are diffi…