Jun, 2021
Evaluating Word Embeddings with Categorical Modularity
Sílvia Casacuberta, Karina Halevy, Damián E. Blasi
TL;DR
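This paper introduces categorical modularity, a new low-resource intrinsic metric for evaluating the quality of word embedding models. Using a core set of 500 words drawn from 59 neurobiologically motivated semantic categories, the authors analyze three word embedding models across 29 languages and report moderate to strong positive correlations between categorical modularity and performance on monolingual and cross-lingual tasks.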
Abstract
We introduce categorical modularity, a novel low-resource intrinsic metric to evaluate word embedding quality. Categorical modularity is a graph modularity metric based on the $k$-nearest neighbor graph construct…
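To make the idea of a modularity score on a $k$-nearest neighbor graph concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes cosine similarity, an undirected and unweighted graph, and the standard Newman modularity formula, none of which are confirmed by the truncated abstract above. The helper names (`cosine`, `knn_edges`, `modularity`) and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_edges(vectors, k):
    """Undirected k-NN graph (an assumption): link each word to its k most similar words."""
    n = len(vectors)
    edges = set()
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(vectors[i], vectors[j]),
            reverse=True,
        )
        for j in ranked[:k]:
            edges.add((min(i, j), max(i, j)))  # store each edge once
    return edges


def modularity(edges, labels):
    """Newman modularity of the partition given by the category labels."""
    m = len(edges)
    if m == 0:
        return 0.0
    degree = [0] * len(labels)
    intra = {}  # number of edges with both endpoints in the same category
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
        if labels[i] == labels[j]:
            intra[labels[i]] = intra.get(labels[i], 0) + 1
    degree_sum = {}  # total degree of the nodes in each category
    for node, label in enumerate(labels):
        degree_sum[label] = degree_sum.get(label, 0) + degree[node]
    return sum(
        intra.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy data (hypothetical): two well-separated categories of two words each.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = ["animal", "animal", "tool", "tool"]
toy_edges = knn_edges(toy_vectors, k=1)
toy_score = modularity(toy_edges, toy_labels)
mixed_score = modularity(toy_edges, ["animal", "tool", "animal", "tool"])
```

On this toy input, labels that match the neighborhood structure give a higher score than mismatched labels, which is the intuition behind using modularity as an embedding quality signal: embeddings that place words of the same semantic category close together yield a more modular graph.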