Neural Word Segmentation with Rich Pretraining

April 2017

Neural Word Segmentation with Rich Pretraining

Jie Yang, Yue Zhang, Fei Dong

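TL;DR: Pretraining a modular word segmentation model with richer external information significantly improves neural word segmentation, reaching accuracy competitive with the best methods on six benchmarks.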

Abstract

Neural word segmentation research has benefited from large-scale raw texts by leveraging them for pretraining character and word embeddings. On the other hand, statistical segmentation research has exploited rich