We present the joint contribution of Unbabel and Instituto Superior Técnico to the WMT 2023 Shared Task on Quality Estimation (QE). Our team participated in all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2). For all tasks, we build on the COMETKIWI-22 model (Rei et al., 2022b). Our multilingual approaches rank first in all tasks, reaching state-of-the-art performance for quality estimation at word-, span- and sentence-level granularity. Compared to the previous state-of-the-art, COMETKIWI-22, we show large improvements in correlation with human judgements (up to 10 Spearman points). Moreover, we surpass the second-best multilingual submission to the shared task by up to 3.8 absolute points.

Scaling up COMETKIWI: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task