Oct, 2023
The Inhibitor: ReLU and Addition-Based Attention for Efficient Transformers
Rickard Brännvall
TL;DR
Replaces the dot-product and Softmax-based attention mechanism with an alternative that uses only addition and ReLU activation, improving the computational efficiency of quantized Transformers and enabling larger quantized Transformer models to run on resource-constrained hardware or under alternative arithmetic systems such as homomorphic encryption.
Abstract
To enhance the computational efficiency of quantized transformers, we replace the dot-product and Softmax-based attention with an alternative mechanism involving addition and ReLU activation only. This side-steps …
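
To make the idea concrete, below is a minimal sketch of what an addition- and ReLU-only attention step could look like, contrasted with the dot-product/Softmax formulation it replaces. This is an illustrative assumption, not the paper's exact formulation: the function name addition_relu_attention, the Manhattan-distance "inhibition" scores, the scaling parameter gamma, and the averaging over keys are all hypothetical choices made only to show that the computation can be expressed with additions, subtractions, and ReLU, without any dot products or Softmax.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def addition_relu_attention(Q, K, V, gamma=1.0):
    """Hypothetical addition/ReLU attention sketch (not the paper's exact scheme).

    Scores are Manhattan-style distances built from additions and absolute
    values (|x| = relu(x) + relu(-x)); values are aggregated through a ReLU
    "inhibition" step instead of a Softmax-weighted sum.
    """
    # Pairwise inhibition scores: larger query-key distance => stronger inhibition.
    # Shapes: Q (n_q, d), K (n_k, d) -> Z (n_q, n_k)
    diff = Q[:, None, :] - K[None, :, :]
    Z = (relu(diff) + relu(-diff)).sum(axis=-1)  # ReLU-only absolute value, summed over features

    # Subtract the scaled inhibition from each value, clip with ReLU, and
    # average over keys; no dot products or Softmax are involved.
    H = relu(V[None, :, :] - gamma * Z[:, :, None]).mean(axis=1)
    return H

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Q = rng.normal(size=(4, 8))
    K = rng.normal(size=(6, 8))
    V = rng.normal(size=(6, 8))
    print(addition_relu_attention(Q, K, V).shape)  # (4, 8)
```

Because every operation here is an addition, subtraction, comparison with zero, or ReLU, such a mechanism avoids the multiplications and exponentials of dot-product/Softmax attention, which is what makes it attractive for low-precision quantized arithmetic and for encrypted-computation settings such as homomorphic encryption.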