BriefGPT.xyz
Nov, 2023
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models
Rocktim Jyoti Das, Liqun Ma, Zhiqiang Shen
TL;DR
The Gradient-Based Language Model Pruner (GBLM-Pruner) for pretrained large language models exploits geometric interconnections among gradients to clearly outperform competing approaches, surpassing magnitude pruning, Wanda, and SparseGPT across a range of language evaluations.
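The TL;DR describes gradient-informed unstructured pruning, where gradient magnitudes (gathered over calibration data) modulate which weights are removed instead of relying on weight magnitude alone. A minimal sketch of that idea, assuming a per-weight importance score of |weight| × mean |gradient|; the exact GBLM-Pruner metric is not reproduced here, so the scoring rule and array shapes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy 8x8 weight matrix and a few calibration-batch gradients
# (stand-ins for one linear layer of an LLM and its backprop signals).
W = rng.normal(size=(8, 8))
grads = [rng.normal(size=(8, 8)) for _ in range(4)]

# Mean absolute gradient per weight across calibration batches.
grad_mag = np.mean([np.abs(g) for g in grads], axis=0)

# Assumed importance score: gradient-weighted magnitude.
scores = np.abs(W) * grad_mag

# Prune the lowest-scoring 50% of weights (unstructured sparsity).
sparsity = 0.5
k = int(W.size * sparsity)
threshold = np.partition(scores.ravel(), k)[k]
mask = scores >= threshold
W_pruned = W * mask
```

With magnitude pruning, `scores` would simply be `np.abs(W)`; the gradient factor is what lets calibration data reshape the pruning decision.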
Abstract
Large language models (LLMs) with a billion or more parameters are prime targets for network pruning, which aims to reduce a portion of the network weights without compromising performance. Prior approaches such …