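TL;DR: This paper presents a method for settings where no ground-truth summaries are available: it creates a synthetic dataset from the documents themselves, introduces several noise generation functions, and trains a summarization model to generate the original review. The method outperforms both extractive and abstractive baselines.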
Abstract
The supervised training of high-capacity models on large datasets containing
hundreds of thousands of document-summary pairs is critical to the recent
success of deep learning techniques for abstractive summarization.
Unfortunately, in most domains (other than news) such training data