BriefGPT.xyz
Feb, 2025
MaskPrune: Mask-based LLM Pruning for Layer-wise Uniform Structures
Jiayu Qin, Jianchao Tan, Kefeng Zhang, Xunliang Cai, Wei Wang
TL;DR
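This study addresses the deployment and inference challenges posed by the ever-growing size of large language models (LLMs). It proposes a new mask-learning paradigm based on minimax optimization that yields uniform pruned structures. The method maintains high performance while keeping the pruned model's structure uniform across layers, and it outperforms existing state-of-the-art approaches.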
Abstract
The remarkable performance of Large Language Models (LLMs) in various language tasks has attracted considerable attention. However, the ever-increasing size of these models presents growing challenges for deployment and inference.