Jul, 2024
MINI-LLM: Memory-Efficient Structured Pruning for Large Language Models
Hongrong Cheng, Miao Zhang, Javen Qinfeng Shi
TL;DR
This paper proposes a Memory-effIcieNt structured pruning method (MINI-LLM) that integrates multiple criteria, namely magnitude, activation, and gradient, and uses feature-map sensitivity to guide pruning. The method effectively reduces GPU memory usage while delivering strong performance on multiple downstream tasks.
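To make the idea concrete, here is a minimal sketch of a combined channel-importance score that mixes magnitude, activation, and gradient signals as the summary describes. The function name and the specific way the three criteria are combined (a simple product) are illustrative assumptions; the paper's actual scoring formula and feature-map sensitivity computation may differ.

```python
def channel_importance(weight, activations, gradients):
    """Hypothetical per-output-channel importance score combining
    magnitude, activation, and gradient criteria (illustrative only;
    not the paper's exact formula).

    weight, gradients: list of rows, one per output channel
    activations: list of samples, each a list of per-channel values
    """
    n_out = len(weight)
    scores = []
    for c in range(n_out):
        magnitude = sum(abs(x) for x in weight[c])            # weight magnitude
        act = sum(abs(s[c]) for s in activations) / len(activations)  # mean |activation|
        grad = sum(abs(x) for x in gradients[c])              # gradient magnitude
        scores.append(magnitude * act * grad)
    return scores

# Toy example: channels with low combined scores become pruning candidates.
weight = [[1.0, -1.0], [2.0, 2.0]]
activations = [[1.0, 2.0], [3.0, 0.0]]
gradients = [[0.5, 0.5], [1.0, 1.0]]
scores = channel_importance(weight, activations, gradients)
# Structured pruning would then drop the lowest-scoring channels.
pruned_order = sorted(range(len(scores)), key=lambda c: scores[c])
```

Structured pruning removes whole channels (rows here) rather than individual weights, which is what makes the resulting model directly faster on GPUs without sparse kernels.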
Abstract
As large language models (LLMs) grow dramatically in size, there is an increasing trend toward compressing and speeding up these models. Previous studies have highlighted the usefulness of