Oct, 2021

An Isotropy Analysis in the Multilingual BERT Embedding Space

Sara Rajaee, Mohammad Taher Pilehvar

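TL;DR: This study examines how to address anisotropy and outlier dimensions in the language representations of multilingual BERT in order to improve their expressiveness and performance, and finds that the embedding spaces of different languages are partially similar in structure.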

Abstract

Several studies have explored various advantages of multilingual pre-trained models (e.g., multilingual BERT) in capturing shared linguistic knowledge. However, their limitations have received comparatively little attention. In this paper, we investigate the representation degeneration problem