BriefGPT.xyz
Oct, 2022
Stop Measuring Calibration When Humans Disagree
Joris Baan, Wilker Aziz, Barbara Plank, Raquel Fernandez
TL;DR
For deep learning classifiers, calibration is typically assessed by measuring predictive probabilities against the human majority vote; however, when humans themselves disagree, evaluating against the full distribution of human judgments - via statistics such as class frequencies, rankings, and entropy - yields a more accurate picture of whether a classifier knows its own confidence.
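A minimal sketch of the idea in the TL;DR: instead of collapsing annotator votes into one majority label, build the empirical human label distribution and compare it with the model's predictive distribution. The function names, label set, and distance choice (total variation distance) here are illustrative assumptions, not the paper's exact method.

```python
from collections import Counter
import math

def human_label_distribution(annotations, labels):
    """Empirical distribution over labels from individual annotator votes
    (illustrative helper, not from the paper)."""
    counts = Counter(annotations)
    total = len(annotations)
    return [counts[lab] / total for lab in labels]

def entropy_bits(p):
    """Shannon entropy in bits; higher means more human disagreement."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def tvd(p, q):
    """Total variation distance between model and human distributions."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical NLI-style example with five annotators.
labels = ["entailment", "neutral", "contradiction"]
votes = ["neutral", "neutral", "entailment", "neutral", "contradiction"]
human = human_label_distribution(votes, labels)  # [0.2, 0.6, 0.2]
model = [0.1, 0.7, 0.2]                          # model's predictive probs

print(entropy_bits(human))  # disagreement among annotators
print(tvd(model, human))    # → 0.1: gap between model and humans
```

A majority-vote evaluation would call the model "confident and correct" here; comparing full distributions shows it is slightly overconfident in the majority class.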
Abstract
Calibration is a popular framework to evaluate whether a classifier knows when it does not know - i.e., its predictive probabilities are a …