Calibrating language models (LMs) aligns their generation confidence with the actual likelihood of answer correctness, which can inform users about LMs' reliability and mitigate hallucinated content. However, prior calibration methods, such as self-consistency-based and logit-based approaches, are either limited in inference-time efficiency or fall short of providing informative signals. Moreover, simply filtering out low-confidence responses reduces the LM's helpfulness when those answers are in fact correct. Therefore, effectively using calibration techniques to enhance an LM's factuality remains an unsolved challenge. In this paper, we first propose an activation-based calibration method, ActCab, which trains a linear layer on top of the LM's last-layer activations, as these can better capture the representations of knowledge. Building on ActCab, we further propose CoDec, a confidence-guided decoding strategy that elicits truthful, high-confidence answers from LMs. Evaluated on five popular QA benchmarks, ActCab achieves better calibration performance than all competitive baselines, e.g., reducing the average expected calibration error (ECE) score by up to 39%. Further experiments on CoDec show consistent improvements in several LMs' factuality on challenging QA datasets, such as TruthfulQA, highlighting the value of confidence signals in enhancing factuality.
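To make the ECE metric and the idea of a linear confidence layer concrete, the following is a minimal sketch rather than the method evaluated in the paper. It assumes a generic sigmoid over a linear map of one activation vector and the common equal-width-bin form of ECE; the function names `probe_confidence` and `expected_calibration_error`, the bin count, and the sigmoid output are illustrative assumptions, and the training objective used for ActCab is not reproduced here.

```python
import math


def probe_confidence(activation, weights, bias):
    """Map one activation vector to a confidence in (0, 1).

    Assumption: a plain linear layer followed by a sigmoid. This is a
    generic probe, not necessarily the exact form used by ActCab.
    """
    logit = sum(a * w for a, w in zip(activation, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE.

    ECE = sum over bins of (bin size / N) * |bin accuracy - bin mean confidence|.
    `confidences` are floats in [0, 1]; `correct` are 0/1 labels.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")

    bins = [[] for _ in range(n_bins)]
    for conf, label in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, label))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(label for _, label in members) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece
```

A lower value indicates that stated confidence tracks empirical accuracy more closely; a model whose confidence always equals its per-bin accuracy would score 0.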

Enhancing Language Model Factuality via Activation-Based Confidence Calibration and Guided Decoding