Apr, 2024
PairEval: Open-domain Dialogue Evaluation with Pairwise Comparison
ChaeHun Park, Minseok Choi, Dohyun Lee, Jaegul Choo
TL;DR
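Proposes PairEval, a dialogue evaluation metric that assesses a response by comparing it against other dialogue responses. The metric is more robust than baseline metrics and correlates more strongly with human judgments.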
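The idea of scoring a response through comparisons can be illustrated with a minimal sketch. This page does not describe how PairEval actually performs or aggregates its comparisons, so everything below is an assumption made for illustration: the names `pairwise_score` and `toy_prefer`, the comparator signature, and the choice to average preference probabilities over a set of comparison responses are all hypothetical. In practice the comparator would presumably be a learned model rather than the word-overlap stand-in used here.

```python
from typing import Callable, Sequence

# Hypothetical comparator type: given (context, a, b), returns the
# probability in [0, 1] that response `a` is better than response `b`.
Comparator = Callable[[str, str, str], float]


def pairwise_score(
    context: str,
    response: str,
    comparisons: Sequence[str],
    prefer: Comparator,
) -> float:
    """Score `response` as its mean preference over comparison responses.

    Assumption for illustration only: a plain average of the comparator's
    outputs; the actual metric may aggregate differently.
    """
    if not comparisons:
        raise ValueError("at least one comparison response is required")
    total = sum(prefer(context, response, other) for other in comparisons)
    return total / len(comparisons)


def toy_prefer(context: str, a: str, b: str) -> float:
    """Toy stand-in comparator: prefers the response that shares more
    words with the context. It exists only to make the sketch runnable."""
    ctx = set(context.lower().split())
    overlap_a = len(ctx & set(a.lower().split()))
    overlap_b = len(ctx & set(b.lower().split()))
    if overlap_a == overlap_b:
        return 0.5
    return 1.0 if overlap_a > overlap_b else 0.0


if __name__ == "__main__":
    context = "do you like jazz music"
    others = ["the weather is nice", "no"]
    print(pairwise_score(context, "i like jazz music a lot", others, toy_prefer))
    print(pairwise_score(context, "ok", others, toy_prefer))
```

A relative score of this kind depends on which comparison responses are chosen, so the selection of that set is a design decision in its own right.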
Abstract
Building a reliable and automated evaluation metric is a necessary but challenging problem for open-domain dialogue systems. Recent studies proposed evaluation metrics that assess generated responses by consideri