February 2024
Do Efficient Transformers Really Save Computation?
Kai Yang, Jan Ackermann, Zhenyu He, Guhao Feng, Bohang Zhang...
TL;DR
We study Transformer-based language models, focusing on the reasoning capabilities of the Sparse Transformer and the Linear Transformer, and find that they are more efficient on a class of dynamic programming problems.
Abstract
As transformer-based language models are trained on increasingly large datasets and with vast numbers of parameters, finding more efficient alternatives to the standard Transformer has become very valuable.
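The TL;DR names the Linear Transformer as one of the two efficient variants studied. As a rough illustration only (not code from this paper), the following NumPy sketch shows the mechanism that makes linear attention cheap: with a kernel feature map phi (here elu(x)+1, following Katharopoulos et al.'s linear attention), associativity lets attention be computed as phi(Q) @ (phi(K).T @ V) in O(n*d^2) time, instead of the O(n^2*d) of standard softmax attention.

    import numpy as np

    def softmax_attention(Q, K, V):
        # Standard attention: O(n^2 * d), materializes the full n x n score matrix.
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    def linear_attention(Q, K, V):
        # Kernelized linear attention: O(n * d^2), never forms the n x n matrix.
        # phi(x) = elu(x) + 1 keeps features strictly positive (an illustrative choice).
        phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))
        Qp, Kp = phi(Q), phi(K)
        kv = Kp.T @ V                 # (d, d_v) summary of all keys and values
        z = Qp @ Kp.sum(axis=0)       # (n,) per-query normalizer
        return (Qp @ kv) / z[:, None]

    rng = np.random.default_rng(0)
    n, d = 128, 16
    Q, K, V = rng.normal(size=(3, n, d))
    print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)

The saving comes from reordering the matrix products: the (d, d_v) key-value summary is independent of the query count, so sequence length enters only linearly. The paper's question is whether this asymptotic saving survives when the model must carry out chain-of-thought reasoning on dynamic programming tasks.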