BriefGPT.xyz
Jun, 2015
The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems
Ryan Lowe, Nissan Pow, Iulian Serban, Joelle Pineau
TL;DR
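Introduces the Ubuntu Dialogue Corpus, which contains almost 1 million multi-turn dialogues and can be used to build dialogue managers based on neural language models. The paper also presents two neural learning architectures suited to this dataset and reports baseline performance on the task of selecting the best next response.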
Abstract
This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building