Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models

Jun, 2021

Tyler A. Chang, Yifan Xu, Weijian Xu, Zhuowen Tu

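TL;DR: This study examines the relationship between convolutions and self-attention in natural language tasks, proposes a method for integrating convolutions into self-attention, and uses BERT on multiple downstream tasks to validate the performance advantage of convolutions over absolute position embeddings.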

Abstract

In this paper, we detail the relationship between convolutions and self-attention in natural language tasks. We show that relative position embed