Oct 2023
Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain
TL;DR: The IdentityChain framework evaluates Code Large Language Models (Code LLMs) for both self-consistency and general accuracy. It exposes weaknesses in current models and also serves as a model debugging tool.