Large Language Models (LLMs) are increasingly used as powerful tools for several high-stakes natural language processing (NLP) applications. Recent prompting works claim to elicit intermediate reasoning steps and key tokens that serve as proxy explanations for LLM predictions. However, there is no certainty whether these explanations are reliable and reflect the LLMs behavior. In this work, we make one of the first attempts at quantifying the uncertainty in explanations of LLMs. To this end, we propose two novel metrics -- $\textit{Verbalized Uncertainty}$ and $\textit{Probing Uncertainty}$ -- to quantify the uncertainty of generated explanations. While verbalized uncertainty involves prompting the LLM to express its confidence in its explanations, probing uncertainty leverages sample and model perturbations as a means to quantify the uncertainty. Our empirical analysis of benchmark datasets reveals that verbalized uncertainty is not a reliable estimate of explanation confidence. Further, we show that the probing uncertainty estimates are correlated with the faithfulness of an explanation, with lower uncertainty corresponding to explanations with higher faithfulness. Our study provides insights into the challenges and opportunities of quantifying uncertainty in LLM explanations, contributing to the broader discussion of the trustworthiness of foundation models.

在这项研究中，我们尝试量化大型语言模型（LLM）解释的不确定性。为此，我们提出了两个新的度量标准——“口头化不确定性”和“探测不确定性”，用于量化生成解释的不确定性。我们的实证分析揭示了口头化不确定性不是可靠的解释置信度的估计，而探测不确定性的估计与解释的忠实度相关，较低的不确定性对应于较高的忠实度。这项研究为量化LLM解释的不确定性带来了洞察，有助于更广泛地探讨基础模型的可靠性。

大型语言模型的自然语言解释的不确定性量化