Aug, 2024
Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In!
Stefano Perrella, Lorenzo Proietti, Alessandro Scirè, Edoardo Barba, Roberto Navigli
TL;DR
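This work addresses problems in the current machine translation meta-evaluation framework and proposes sentinel metrics to assess its accuracy, robustness, and fairness. Using sentinel metrics, the study shows that the current framework favors certain categories of metrics, and it raises the concern that state-of-the-art metrics may base their assessments on spurious correlations found in their training data.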
Abstract
Annually, at the Conference of Machine Translation (WMT), the Metrics Shared Task organizers conduct the meta-evaluation of …