Jan, 2025
Scaling Inference-Efficient Language Models
Song Bian, Minghao Yan, Shivaram Venkataraman
TL;DR
This work addresses the problem that existing scaling laws fail to account for inference cost. It proposes a new approach that modifies the Chinchilla scaling laws to jointly optimize the number of model parameters, the number of training tokens, and the model architecture. Through an extensive empirical study of 63 different models, the authors derive the Morph-1B model, which improves inference latency by 1.8x while preserving downstream-task accuracy.
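For background, the TL;DR builds on the Chinchilla scaling law. Below is a minimal sketch of its standard parametric loss form, using the fitted constants reported by Hoffmann et al. (2022); the paper's modified, architecture-aware variant is not reproduced here.

```python
def chinchilla_loss(n_params, n_tokens,
                    E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Chinchilla-style parametric loss: L(N, D) = E + A / N^alpha + B / D^beta.

    n_params: model parameter count N; n_tokens: training token count D.
    Default constants are the published Chinchilla fits (Hoffmann et al., 2022);
    the paper extends this functional form with model-shape terms, which this
    sketch does not attempt to reproduce.
    """
    return E + A / n_params**alpha + B / n_tokens**beta

# Loss decreases as either model size or training data grows:
big_model = chinchilla_loss(2e9, 20e9)
small_model = chinchilla_loss(1e9, 20e9)
```

Under this form, loss depends only on N and D, which is exactly the limitation the paper targets: two models with equal N can have very different inference latencies depending on their width/depth trade-off.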
Abstract
Scaling laws are powerful tools to predict the performance of large language models. However, current scaling laws fall short of accounting for inference costs. In this work, we first show that …