Aug, 2022
MENLI: Robust Evaluation Metrics from Natural Language Inference
Yanran Chen, Steffen Eger
TL;DR
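This paper proposes evaluation metrics based on natural language inference (NLI). They are more robust than earlier BERT-based metrics, and combining them with other evaluation metrics improves both robustness and quality scores.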
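As a rough illustration of what combining an NLI-based metric with another metric could look like, here is a minimal sketch that assumes a simple weighted average of two scores already scaled to [0, 1]. The function name `combine_scores`, the `weight` parameter, and the weighted-average form are assumptions made for this example; the summary above does not state how the combination is done.

```python
def combine_scores(nli_score: float, base_score: float, weight: float = 0.5) -> float:
    """Blend an NLI-based score with an existing metric's score.

    Hypothetical helper: assumes both scores are already on a [0, 1] scale
    and that a plain weighted average is an acceptable way to combine them.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # weight = 1.0 uses only the NLI-based score; weight = 0.0 uses only the base metric.
    return weight * nli_score + (1.0 - weight) * base_score


if __name__ == "__main__":
    # Made-up scores for one candidate text, for illustration only.
    print(combine_scores(nli_score=0.2, base_score=0.8, weight=0.5))
```

In a sketch like this, `weight` would be the knob for trading robustness (more NLI) against agreement with the base metric.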
Abstract
Recently proposed BERT-based evaluation metrics perform well on standard evaluation benchmarks but are vulnerable to adversarial attacks, e.g., relating to factuality errors. We argue that this stems (in part) fr
→