BriefGPT.xyz
Apr, 2015
Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space
Arvind Neelakantan, Jeevan Shankar, Alexandre Passos, Andrew McCallum
TL;DR
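The paper proposes an extension to the Skip-gram model that efficiently learns multiple embeddings per word type. It performs word sense discrimination and embedding learning jointly, estimates the number of senses for each word type non-parametrically, and demonstrates its scalability by training on a corpus of nearly 1 billion tokens using a single machine.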
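The non-parametric part of the summary above can be illustrated with a small sketch. It assumes that each word keeps a list of sense clusters, that a context is assigned to the most similar cluster, and that a new sense is created when no cluster is similar enough. The names `assign_sense`, `centers`, `counts`, and the `threshold` value are hypothetical and chosen for illustration; this is not the authors' implementation, and the Skip-gram embedding update itself is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_sense(context_vec, centers, counts, threshold=0.5):
    """Pick a sense index for one context of a word, growing the sense list on demand.

    centers: list of cluster centers, one per sense discovered so far (mutated).
    counts:  number of contexts assigned to each sense so far (mutated).
    threshold: assumed similarity cutoff below which a new sense is created.
    """
    best, best_sim = -1, float("-inf")
    for i, center in enumerate(centers):
        sim = cosine(context_vec, center)
        if sim > best_sim:
            best, best_sim = i, sim

    # No sense yet, or no existing sense is similar enough: create a new one.
    if best == -1 or best_sim < threshold:
        centers.append(list(context_vec))
        counts.append(1)
        return len(centers) - 1

    # Otherwise update the chosen cluster center as a running mean.
    n = counts[best]
    centers[best] = [(c * n + x) / (n + 1) for c, x in zip(centers[best], context_vec)]
    counts[best] = n + 1
    return best


# Usage: three toy 2-d context vectors for a single word type.
centers, counts = [], []
first = assign_sense([1.0, 0.0], centers, counts)   # creates sense 0
second = assign_sense([0.9, 0.1], centers, counts)  # close to sense 0, so reuses it
third = assign_sense([0.0, 1.0], centers, counts)   # dissimilar, so creates sense 1
```

Because the number of senses is not fixed in advance, a word seen in only one kind of context ends up with a single sense, while a word seen in dissimilar contexts accumulates several. In a joint model, the embedding for the selected sense would then be the one updated by the Skip-gram objective.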
Abstract
There is rising interest in vector-space word embeddings and their use in NLP, especially given recent methods for their fast estimation at very large scale. Nearly all this work, however, assumes a single vector …