BriefGPT.xyz
Aug, 2021
Program Synthesis with Large Language Models
Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski...
TL;DR
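This paper explores the limits of large language models for program synthesis in general-purpose programming languages and evaluates them on new benchmarks. The authors test the models on two benchmarks (MBPP and MathQA-Python) and find that performance scales log-linearly with model size. They also study the models' ability to engage in dialogue and to model program semantics, and find that even the best models cannot fully predict the output of certain programs.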
Abstract
This paper explores the limits of the current generation of large language models for program synthesis in general purpose programming languages. We evaluate a collection of such models (with between 244M and 137
→