BriefGPT.xyz
Jun, 2024
BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks
Amrutha Varshini Ramesh, Vignesh Ganapathiraman, Issam H. Laradji, Mark Schmidt
TL;DR
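BlockLLM selects and updates only a very small subset of the trainable parameters. This reduces the memory footprint of the underlying optimization process without changing the model architecture or the training procedure, and the method is reported to achieve state-of-the-art perplexity scores on the GLUE benchmark.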
Abstract
Training large language models (LLMs) for pretraining or adapting to new tasks and domains has become increasingly critical as their applications expand. However, as the model and the data sizes grow, the training process presents significant …