October 2023
CLEX: Continuous Length Extrapolation for Large Language Models
Guanzheng Chen, Xin Li, Zaiqiao Meng, Shangsong Liang, Lidong Bing
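TL;DR: Proposes Continuous Length Extrapolation (CLEX), a method for Transformer-based Large Language Models (LLMs) that extends the context window to 4 or 8 times the training sequence length while achieving competitive performance on practical tasks.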