Text Segmentation as a Supervised Learning Task

Mar, 2018

Omri Koshorek, Adir Cohen, Noam Mor, Michael Rotman, Jonathan Berant

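TL;DR: This work uses Wikipedia articles to study text segmentation as a supervised learning problem, proposes a segmentation model trained on this dataset, and shows that it generalizes to unseen natural text.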
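To make the supervised framing concrete, here is a minimal sketch of how a document that is already divided into sections (as a Wikipedia article is) might be turned into sentence-level training labels. The function name `label_sentences`, the input layout (a list of sections, each a list of sentences), and the convention of leaving the final sentence of the document unlabeled as a boundary are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Tuple


def label_sentences(sections: List[List[str]]) -> Tuple[List[str], List[int]]:
    """Flatten sections into one sentence list with binary boundary labels.

    A label of 1 marks a sentence that ends a section and is followed by
    another section. The last sentence of the document gets 0, because no
    boundary decision is needed after it (an assumption of this sketch).
    """
    sentences: List[str] = []
    labels: List[int] = []
    for i, section in enumerate(sections):
        for j, sentence in enumerate(section):
            sentences.append(sentence)
            ends_section = j == len(section) - 1
            is_final_section = i == len(sections) - 1
            labels.append(1 if ends_section and not is_final_section else 0)
    return sentences, labels


# Hypothetical three-section document.
doc = [["A1.", "A2."], ["B1."], ["C1.", "C2.", "C3."]]
sentences, labels = label_sentences(doc)
print(list(zip(sentences, labels)))
```

Under this framing, a segmentation model is any classifier that predicts the label sequence from the sentence sequence, so existing section headings provide supervision without manual annotation.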

Abstract

Text segmentation, the task of dividing a document into contiguous segments based on its semantic structure, is a longstanding challenge in language understanding. Previous work on text segmentation focused on un