BriefGPT.xyz
Jan, 2025
TreeKV: Smooth Key-Value Cache Compression with Tree Structures
Ziwei He, Jian Yuan, Haoli Bai, Jingwen Leng, Bo Jiang
TL;DR
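This study addresses the shortcomings of existing key-value (KV) cache compression methods in retaining information in long-sequence and resource-limited settings, and proposes TreeKV, an intuitive, training-free method. TreeKV uses a tree structure to compress the cache smoothly. It performs well on language modeling tasks, shows significant gains over baseline models at longer context lengths, and reaches its best efficiency with only 6% of the budget.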
Abstract
Efficient key-value (KV) cache compression is critical for scaling transformer-based Large Language Models (LLMs) in long sequences and resource-limited settings. Existing methods evict tokens based on their posi…