Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval. Starting from an apparent link between text coherence and segmentation, we introduce a novel supervised model for text segmentation with simple but explicit coherence modeling. Our model -- a neural architecture consisting of two hierarchically connected Transformer networks -- is a multi-task learning model that couples the sentence-level segmentation objective with the coherence objective that differentiates correct sequences of sentences from corrupt ones. The proposed model, dubbed Coherence-Aware Text Segmentation (CATS), yields state-of-the-art segmentation performance on a collection of benchmark datasets. Furthermore, by coupling CATS with cross-lingual word embeddings, we demonstrate its effectiveness in zero-shot language transfer: it can successfully segment texts in languages unseen in training.
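The coupling of the two objectives can be illustrated with a minimal sketch. The abstract states only that a segmentation objective is combined with a coherence objective separating correct sentence sequences from corrupt ones; everything else here is an assumption made for illustration: the margin-based hinge form of the coherence loss, the corruption strategy (swapping in unrelated sentences), the weighting term, and all function and parameter names (`segmentation_loss`, `corrupt_sequence`, `coherence_loss`, `joint_loss`, `margin`, `coherence_weight`). The scores and probabilities that the two Transformer networks would produce are passed in as plain numbers.

```python
import math
import random


def segmentation_loss(boundary_probs, gold_boundaries):
    """Mean binary cross-entropy over per-sentence boundary predictions.

    boundary_probs: predicted probability that each sentence starts a segment.
    gold_boundaries: 1 if the sentence starts a segment, else 0.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(boundary_probs, gold_boundaries):
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(gold_boundaries)


def corrupt_sequence(sentences, distractors, rng, n_replace=1):
    """Return a copy with n_replace sentences swapped for unrelated ones.

    Assumed corruption strategy; the abstract does not specify one.
    """
    corrupted = list(sentences)
    for i in rng.sample(range(len(corrupted)), n_replace):
        corrupted[i] = rng.choice(distractors)
    return corrupted


def coherence_loss(score_correct, score_corrupt, margin=1.0):
    """Hinge loss (assumed form): the correct sequence should outscore
    its corrupt counterpart by at least `margin`."""
    return max(0.0, margin - (score_correct - score_corrupt))


def joint_loss(seg_loss, coh_loss, coherence_weight=0.5):
    """Multi-task objective: segmentation loss plus weighted coherence loss."""
    return seg_loss + coherence_weight * coh_loss


if __name__ == "__main__":
    rng = random.Random(0)
    sentences = ["s1", "s2", "s3", "s4"]
    corrupt = corrupt_sequence(sentences, ["x1", "x2"], rng)
    print("corrupt sequence:", corrupt)

    # Hypothetical model outputs for one training snippet.
    seg = segmentation_loss([0.9, 0.1, 0.2, 0.8], [1, 0, 0, 1])
    coh = coherence_loss(score_correct=0.7, score_corrupt=0.4)
    print("joint loss:", joint_loss(seg, coh))
```

In a real model the boundary probabilities and coherence scores would be differentiable outputs of the network, so that minimizing the joint loss trains both tasks through the shared encoder.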

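We propose a new supervised text segmentation model built on simple but explicit coherence modeling. Its neural architecture consists of two hierarchically connected Transformer networks, and it is trained as a multi-task model that couples the sentence-level segmentation objective with a coherence objective that distinguishes correctly ordered sentence sequences from corrupt ones. The model, called Coherence-Aware Text Segmentation (CATS), achieves state-of-the-art segmentation performance on a collection of benchmark datasets. Combined with cross-lingual word embeddings, it is also effective in zero-shot language transfer: it can successfully segment texts in languages unseen during training.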

Two-level Transformer and auxiliary coherence modeling for improved text segmentation