Confidence estimation, which aims to evaluate the trustworthiness of model outputs, is crucial for applying large language models (LLMs), especially black-box ones. Existing confidence estimates for LLMs are typically poorly calibrated because LLMs are overconfident in the incorrect answers they generate. Existing approaches to the overconfidence issue share a significant limitation: they consider the confidence of only one answer generated by the LLM. To address this limitation, we propose a novel paradigm that thoroughly evaluates the trustworthiness of multiple candidate answers in order to mitigate overconfidence in incorrect answers. Building on this paradigm, we introduce a two-step framework that first instructs the LLM to reflect on each answer and provide a justification for it, and then aggregates the justifications into a comprehensive confidence estimate. The framework can be integrated with existing confidence estimation approaches to improve calibration. Experimental results on six datasets across three tasks support the soundness and effectiveness of the proposed framework.
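
The two-step framework described above can be illustrated with a minimal sketch. The abstract does not specify the prompts or the aggregation rule, so everything here is an assumption made for illustration: the function name `estimate_confidence`, the `justify` and `score` callables (which in practice would wrap LLM calls), and the choice to normalize support scores across candidates. The toy stubs stand in for a real model.

```python
from typing import Callable, Dict, List

# Hypothetical interfaces; in practice both would wrap prompts to an LLM.
Justify = Callable[[str, str], str]                  # (question, answer) -> justification
Score = Callable[[str, str, Dict[str, str]], float]  # (question, answer, all justifications) -> support


def estimate_confidence(
    question: str,
    candidates: List[str],
    justify: Justify,
    score: Score,
) -> Dict[str, float]:
    """Return one confidence value per candidate answer (values sum to 1)."""
    # Step 1: reflect on every candidate answer, not only the top one.
    justifications = {a: justify(question, a) for a in candidates}

    # Step 2: score each candidate in light of all justifications.
    # Assumption: negative scores are clipped to zero before normalizing.
    support = {a: max(score(question, a, justifications), 0.0) for a in candidates}
    total = sum(support.values())
    if total == 0.0:
        # No candidate has any support: fall back to a uniform distribution.
        return {a: 1.0 / len(candidates) for a in candidates}
    return {a: s / total for a, s in support.items()}


# Toy stand-ins for the LLM calls, for illustration only.
def toy_justify(question: str, answer: str) -> str:
    return f"Argument for '{answer}' as the answer to: {question}"


def toy_score(question: str, answer: str, justifications: Dict[str, str]) -> float:
    return 3.0 if answer == "4" else 1.0


confidence = estimate_confidence("What is 2 + 2?", ["4", "5"], toy_justify, toy_score)
print(confidence)
```

Because each candidate is scored with the justifications for its competitors in view, a wrong answer must compete for probability mass rather than being assessed in isolation, which is the intuition behind considering multiple answers.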

Confidence Estimation for Large Language Models: Thinking Twice by Reflecting on Multiple Answers