BriefGPT.xyz
May, 2025
Neural Thermodynamic Laws for Large Language Model Training
Ziming Liu, Yizhou Liu, Jeff Gore, Max Tegmark
TL;DR
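This study addresses a theoretical gap in current large language model training by proposing a new framework, Neural Thermodynamic Laws (NTL). By analyzing thermodynamic quantities and classical thermodynamic principles, it offers intuitive guidance for designing learning rate schedules, giving the work both theoretical and practical significance.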
Abstract
Beyond neural scaling laws, little is known about the laws underlying Large Language Models (LLMs). We introduce Neural Thermodynamic Laws (NTL), a new framework that offers fresh insights into LLM training dynamics.