BriefGPT.xyz
Jan, 2024
E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, Yu Zhang...
TL;DR
We propose E^2-LLM, a method for efficient and extreme length extension of large language models. It reduces training cost and applies augmentation over different samples so that arbitrary context lengths can be supported at inference time; experimental results demonstrate its effectiveness on challenging long-context tasks.
Abstract
Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources. Existing long-context extension methods usually need additional training procedures…