Apr, 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su, Yu Lu, Shengfeng Pan, Bo Wen, Yunfeng Liu
TL;DR
This paper studies how positional information can be integrated into language models and proposes a method named RoPE, which encodes positional information as a rotation matrix while incorporating explicit relative-position dependency into the self-attention formulation. Experiments show that RoPE enables the transformer to achieve superior performance on long-text classification tasks.
Abstract
Position encoding in the transformer architecture provides supervision for dependency modeling between elements at different positions in the sequence. We investigate various methods to encode positional information in transformer-based language models and propose a novel implementation named Rotary Position Embedding (RoPE).
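To make the described mechanism concrete, here is a minimal NumPy sketch (not the authors' implementation; the function name and toy check are illustrative) of applying a rotary embedding to a single query or key vector. Consecutive dimension pairs are rotated by position-dependent angles, so the dot product of a rotated query and key depends only on their relative offset.

```python
import numpy as np

def rotary_embed(x, position, base=10000.0):
    """Apply a rotary position embedding to a 1-D vector x at the given position.

    Dimension pairs (2i, 2i+1) are rotated by angle position * theta_i,
    with theta_i = base^(-2i/d), so rotated query/key dot products depend
    only on the relative position between query and key.
    """
    d = x.shape[-1]
    theta = base ** (-np.arange(0, d, 2) / d)   # (d/2,) per-pair frequencies
    angles = position * theta                   # (d/2,) rotation angle per pair
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]                   # split vector into 2-D pairs
    rotated = np.empty_like(x)
    rotated[0::2] = x1 * cos - x2 * sin         # rotate each pair in its plane
    rotated[1::2] = x1 * sin + x2 * cos
    return rotated

# Attention scores then depend only on the offset m - n between positions:
rng = np.random.default_rng(0)
q, k = rng.normal(size=64), rng.normal(size=64)
s1 = rotary_embed(q, 3) @ rotary_embed(k, 1)    # offset 2
s2 = rotary_embed(q, 10) @ rotary_embed(k, 8)   # same offset 2
print(np.allclose(s1, s2))                      # True: score is relative-position dependent
```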