Unsupervised Identification of Translationese

Sep, 2016

Ella Rabinovich, Shuly Wintner

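TL;DR: This study demonstrates a highly accurate method, based on unsupervised clustering, for distinguishing original from translated texts across domains. It proposes a simple clustering method for cross-domain datasets, along with a method for determining the correct labels of the resulting clusters.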

Abstract

Translated texts are distinctively different from original ones, to the extent that supervised text classification methods can distinguish between them with high accuracy. These differences were proven useful for