QuaCer-C: Quantitative Certification of Knowledge Comprehension in LLMs

Feb 2024

QuaCer-C: Quantitative Certification of Knowledge Comprehension in LLMs

Isha Chaudhary, Vedaant V. Jain, Gagandeep Singh

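TL;DR: Proposes QuaCer-C, a new certification framework that formally certifies the knowledge comprehension capabilities of popular LLMs. Using high-confidence probability bounds, it shows that an LLM's ability to answer correctly on any relevant knowledge comprehension prompt improves as the number of parameters increases, and that the Mistral model performs poorly in this evaluation.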
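To make "high-confidence probability bounds" concrete, here is a minimal sketch of how such a bound can be computed from sampled prompts. It uses Hoeffding's inequality as a stand-in, which is not necessarily the bound used in the paper. The function name `hoeffding_bounds` and the sample counts are hypothetical, chosen only for illustration.

```python
import math


def hoeffding_bounds(num_correct, num_samples, delta=0.05):
    """Two-sided Hoeffding interval for an unknown success probability.

    Assumption for illustration: prompts are sampled i.i.d. and each answer
    is scored correct/incorrect. With probability at least 1 - delta, the
    true probability of a correct answer lies in the returned interval.
    """
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    if not 0 <= num_correct <= num_samples:
        raise ValueError("num_correct must be between 0 and num_samples")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")

    p_hat = num_correct / num_samples
    # Hoeffding: P(|p_hat - p| >= t) <= 2 * exp(-2 * n * t^2).
    # Setting the right-hand side to delta and solving for t gives the margin.
    margin = math.sqrt(math.log(2.0 / delta) / (2.0 * num_samples))
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical numbers: 180 correct answers out of 250 sampled prompts.
lower, upper = hoeffding_bounds(num_correct=180, num_samples=250, delta=0.05)
print(f"95% confidence bounds on P(correct): [{lower:.3f}, {upper:.3f}]")
```

The interval narrows as the number of sampled prompts grows, which is why a certificate of this kind can be tight without enumerating every possible prompt.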

Abstract

Large Language Models (LLMs) have demonstrated impressive performance on several benchmarks. However, traditional studies do not provide formal guarantees on the performance of →