BriefGPT.xyz
Aug, 2024
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty
Yongjin Yang, Haneul Yoo, Hwaran Lee
TL;DR
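This study addresses the problem of large language models producing incorrect responses under data uncertainty. It proposes MAQA, a novel multi-answer question answering dataset, to evaluate uncertainty quantification when data uncertainty is present. The study also evaluates five uncertainty quantification methods across different models and finds that entropy-based and consistency-based methods perform well under data uncertainty, which points to a direction for future uncertainty quantification research.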
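To make the two method families named above more concrete, here is a minimal sketch of an entropy score and a consistency score computed over several sampled answers to the same question. It is an illustration only, not the paper's implementation: the function names `answer_entropy` and `consistency`, and the lowercase-and-strip normalization used to group answers, are assumptions made for this example.

```python
import math
from collections import Counter


def _normalize(samples):
    # Assumption: answers are grouped by exact match after lowercasing
    # and stripping whitespace. Real systems may group by semantic equivalence.
    if not samples:
        raise ValueError("samples must be non-empty")
    return Counter(s.strip().lower() for s in samples)


def answer_entropy(samples):
    """Shannon entropy (in nats) of the empirical distribution of sampled answers."""
    counts = _normalize(samples)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def consistency(samples):
    """Fraction of samples that agree with the most frequent answer."""
    counts = _normalize(samples)
    total = sum(counts.values())
    return counts.most_common(1)[0][1] / total


if __name__ == "__main__":
    sampled = ["Paris", "paris", "Lyon", "Paris "]
    print(answer_entropy(sampled))  # higher value = more spread-out answers
    print(consistency(sampled))     # higher value = more agreement
```

Higher entropy or lower consistency is usually read as higher uncertainty. For a multi-answer question, some spread across samples can come from the question legitimately having several correct answers, which is the setting MAQA is described as targeting.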
Abstract
Although Large Language Models (LLMs) are capable of performing various tasks, they still suffer from producing plausible but incorrect responses. To improve the reliability of LLMs, recent research has focused on Uncer…