BriefGPT.xyz
Jan, 2025
s1: Simple test-time scaling
Niklas Muennighoff, Zitong Yang, Weijia Shi, Xiang Lisa Li, Li Fei-Fei...
TL;DR
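This paper studies how effective test-time scaling is for language models and proposes a simple method to improve reasoning performance. By building a small dataset, s1K, and introducing budget forcing to control the model's thinking process, the study shows that the supervised fine-tuned model s1-32B outperforms existing models on competition math problems, and that its performance also improves significantly without intervention.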
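The summary above does not say how budget forcing is carried out. The following is a minimal sketch of one plausible form, assuming the thinking phase is bounded by a token budget: the loop forces the end of thinking once a maximum is reached, and if the model tries to stop before a minimum, it suppresses the stop and appends a continuation cue instead. The names `END_OF_THINKING`, `CONTINUE_CUE`, `budget_forced_thinking`, and the `next_token` callable are hypothetical and stand in for a real model and tokenizer.

```python
from typing import Callable, List

# Hypothetical markers; a real model would use its own special tokens.
END_OF_THINKING = "</think>"
CONTINUE_CUE = "Wait"


def budget_forced_thinking(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    min_tokens: int,
    max_tokens: int,
) -> List[str]:
    """Generate a thinking trace whose length is kept within a budget.

    next_token is an assumed stand-in for one decoding step of a model:
    it receives the tokens so far and returns the next token.
    """
    trace: List[str] = []
    while len(trace) < max_tokens:
        token = next_token(prompt + trace)
        if token == END_OF_THINKING:
            if len(trace) >= min_tokens:
                break  # the model stopped on its own, within budget
            # The model tried to stop too early: drop the stop token
            # and append a cue that encourages further reasoning.
            trace.append(CONTINUE_CUE)
            continue
        trace.append(token)
    # Close the thinking phase, whether the model stopped or the cap was hit.
    trace.append(END_OF_THINKING)
    return trace
```

Raising `min_tokens` spends more test-time compute on a question, while lowering `max_tokens` caps it, so the two parameters give a direct handle on the thinking budget in this sketch.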
Abstract
Test-time scaling is a promising new approach to language modeling that uses extra test-time compute to improve performance. Recently, OpenAI's o1 model showed this capability but did not publicly share its metho