BriefGPT.xyz
Sep, 2021
A Massively Multilingual Analysis of Cross-linguality in Shared Embedding Space
Alex Jones, William Yang Wang, Kyle Mahowald
TL;DR
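This paper studies the linguistic and non-linguistic factors that affect sentence-level alignment in cross-lingual models, and carries out a comparative analysis using BERT and BiLSTM models with the Bible as the corpus. The results show that agreement in word order and agreement in morphological complexity are the two strongest linguistic predictors of cross-linguality.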
Abstract
In cross-lingual language models, representations for many different languages live in the same space. Here, we investigate the linguistic and non-linguistic factors affecting sentence-level alignment in cross-li