BriefGPT.xyz
Jun, 2024
Changing Answer Order Can Decrease MMLU Accuracy
Vipul Gupta, David Pantoja, Candace Ross, Adina Williams, Megan Ung
TL;DR
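By evaluating test accuracy across multiple subtasks, the study examines how reliable large language models are on multiple-choice question answering datasets and raises the possibility of adjusting leaderboard testing standards.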
Abstract
As large language models (LLMs) have grown in prevalence, particular benchmarks have become essential for the evaluation of these models and for understanding model capabilities. Most commonly, we use
→