BriefGPT.xyz
Jun, 2016
Network-Efficient Distributed Word2vec Training System for Large Vocabularies
Erik Ordentlich, Lee Yang, Andy Feng, Peter Cnudde, Mihajlo Grbovic...
TL;DR
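This paper presents a new Word2vec algorithm based on distributed parallel training. It can efficiently train word vector representations on corpora with large vocabularies of hundreds of millions of words, without requiring heavy data transfer or storing everything on a single server. Experiments show that, in production on the Gemini ad serving platform, it delivered a significant business contribution.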
Abstract
word2vec is a popular family of algorithms for unsupervised training of dense vector representations of words on large text corpuses. The resulting vectors have been shown to capture semantic relationships among...