Oct, 2024
Towards Cross-Lingual LLM Evaluation for European Languages
Klaudia Thellmann, Bernhard Stadler, Michael Fromm, Jasper Schulze Buschhoff, Alex Jude...
TL;DR
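This study addresses the challenge of evaluating large language models (LLMs) consistently and meaningfully across multiple European languages, where multilingual benchmarks are scarce. We propose a cross-lingual evaluation approach tailored to European languages: using translated versions of five widely used benchmarks, we assess 40 LLMs across 21 European languages. The resulting multilingual evaluation framework and datasets are intended to support further research on multilingual LLM evaluation.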
Abstract
The rise of Large Language Models (LLMs) has revolutionized natural language processing across numerous languages and tasks. However, evaluating LLM performance in a consistent and meaningful way across multiple Europea...