BriefGPT.xyz
Oct, 2023
Unlock Predictable Scaling from Emergent Abilities
Shengding Hu, Xin Liu, Xu Han, Xinrong Zhang, Chaoqun He...
TL;DR
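By introducing PassUntil, an evaluation strategy based on massive sampling in the decoding phase, this study quantifies the scaling law of task performance and finds concrete evidence of emergent abilities. This refutes the common "multi-step reasoning hypothesis" for how emergent abilities arise, and the authors propose a new hypothesis that fits the observed scaling curve.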
Abstract
The scientific scale-up of large language models (LLMs) necessitates a comprehensive understanding of their scaling properties. However, the existing literature on the scaling properties only yields an incomplete...