Basel Mousi, Nadir Durrani, Fahim Dalvi, Majd Hawasly, Ahmed Abdelali
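TL;DR: This work uses clustering to explore latent concepts in multilingual models and studies how far multilingual embeddings align and overlap. It introduces two metrics, CA and CO, for quantitative analysis. The findings are that deeper layers of the network are better aligned, that fine-tuning strengthens alignment in the latent space, and that task-specific calibration helps explain the emergence of the models' zero-shot capabilities.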
Abstract
Despite the remarkable ability of multilingual embeddings to capture linguistic nuances across diverse languages, questions persist regarding the degree of alignment between languages in these embeddings. Drawing inspiration from