Feb, 2022
The NLP Task Effectiveness of Long-Range Transformers
Guanghui Qin, Yukun Feng, Benjamin Van Durme
TL;DR
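This comparative study of several Transformer models finds that the variants modified for long sequences have advantages in content selection and query-guided decoding, but fall short in handling distant information and suffer from approximation error.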
Abstract
Transformer models cannot easily scale to long sequences due to their O(N^2) time and space complexity. This has led to Transformer variants seeking to lessen computational complexity, such as Longformer and ...
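As a rough illustration of the O(N^2) cost mentioned in the abstract, the sketch below counts how many query-key scores one attention head computes for a sequence of N tokens, and compares that with a simple sliding-window pattern. The windowed count is only a simplified stand-in for the local attention used by Longformer-style models; the function names and the window size are hypothetical and do not come from the paper.

```python
def full_attention_entries(n: int) -> int:
    """Query-key scores per head in full self-attention: N * N."""
    return n * n


def windowed_attention_entries(n: int, window: int) -> int:
    """Scores when each token attends only to itself and to tokens
    at most `window` positions away on either side (assumed pattern)."""
    total = 0
    for i in range(n):
        lo = max(0, i - window)        # leftmost visible position
        hi = min(n - 1, i + window)    # rightmost visible position
        total += hi - lo + 1
    return total


if __name__ == "__main__":
    # Illustrative sequence lengths; window=256 is an arbitrary choice.
    for n in (512, 4096, 16384):
        full = full_attention_entries(n)
        local = windowed_attention_entries(n, window=256)
        print(f"N={n}: full={full}, windowed={local}")
```

Under these assumptions the full count grows quadratically with N, while the windowed count grows roughly linearly for a fixed window, which is the kind of saving the long-range variants aim for.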