BriefGPT.xyz
Jun, 2024
Full-ECE: A Metric For Token-level Calibration on Large Language Models
Han Liu, Yupeng Zhang, Bingning Wang, Weipeng Chen, Xiaolin Hu
TL;DR
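Deep neural networks and large language models struggle to provide accurate uncertainty estimates. The paper therefore proposes a new notion of calibration, full calibration, and introduces a corresponding metric, Full-ECE, which evaluates how well the entire predicted probability distribution is calibrated.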
Abstract
Deep neural networks (DNNs) excel in various domains but face challenges in providing accurate uncertainty estimates, which are crucial for high-stakes applications.