Duc N. M Hoang, Minsik Cho, Thomas Merth, Mohammad Rastegari, Zhangyang Wang
TL;DR: Large language models, perplexity, compression, prompt-based recovery, inference-time dynamic prompting.
Abstract
Large language models (LLMs), while transformative for NLP, come with significant computational demands, underlining the need for efficient, training-free compression. Notably, the reliability of →