As large language models continue to be widely developed, robust uncertainty quantification techniques will become crucial for their safe deployment in high-stakes scenarios. In this work, we explore how conformal prediction can be used to provide uncertainty quantification in language models for the specific task of multiple-choice question answering. We find that the uncertainty estimates from conformal prediction are tightly correlated with prediction accuracy. This observation can be useful for downstream applications such as selective classification and filtering out low-quality predictions. We also investigate the exchangeability assumption required by conformal prediction as it applies to out-of-subject questions, which may be a more realistic scenario for many practical applications. Our work contributes towards more trustworthy and reliable usage of large language models in safety-critical situations, where robust guarantees on the error rate are required.

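This work explores how conformal prediction can be used to quantify uncertainty in large language models, in order to improve their reliability and stability on tasks such as multiple-choice question answering. The study finds that the uncertainty estimated by conformal prediction is closely associated with the model's prediction accuracy, a finding that can be used in downstream applications such as selective classification and filtering out low-quality predictions. The study also examines how conformal prediction handles questions that fall outside the subject domain. The work aims to provide more trustworthy and reliable usage of large language models in safety-critical scenarios.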

Conformal Prediction with Large Language Models for Multiple-Choice Question Answering