BriefGPT.xyz
Feb, 2018
Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus
Jianxiong Dong, Jim Huang
TL;DR
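This paper proposes a method that combines general-purpose pre-trained word embeddings with word embeddings generated from the task-specific training set, in order to address the large number of out-of-vocabulary words in the Ubuntu Dialogue Corpus. Applied to the next-utterance selection task, the method outperforms the original ESIM model and achieves state-of-the-art results on both the Ubuntu and Douban dialogue corpora.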
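As a rough illustration of what combining the two embedding sources could look like, the sketch below concatenates a general pre-trained vector with a task-specific vector for each word and zero-fills whichever half is missing. The summary only states that the two sources are combined, so the concatenation and zero-fill strategy, the function name `combine_embeddings`, and the toy vectors are all assumptions made for this example, not details taken from the paper.

```python
def combine_embeddings(general, task_specific, general_dim, task_dim):
    """Merge two word -> vector tables by concatenation (assumed strategy).

    A word missing from one table gets a zero vector for that half,
    so every word seen in either table receives a representation.
    """
    combined = {}
    for word in set(general) | set(task_specific):
        g = general.get(word, [0.0] * general_dim)
        t = task_specific.get(word, [0.0] * task_dim)
        combined[word] = list(g) + list(t)
    return combined


# Toy, hypothetical vectors: "apt-get" is absent from the general table,
# as a domain-specific token might be.
general = {"install": [0.1, 0.2], "file": [0.3, 0.4]}
task_specific = {"install": [0.5], "apt-get": [0.9]}

combined = combine_embeddings(general, task_specific, general_dim=2, task_dim=1)
```

With this scheme a domain-specific token such as `apt-get` still gets a non-zero representation from the task-specific half, even though the general table does not contain it.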
Abstract
The Ubuntu Dialogue Corpus is the largest publicly available dialogue corpus, which makes it feasible to build end-to-end deep neural network models directly from the conversation data. One challenge of …