BriefGPT.xyz
Apr, 2024
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models
Giwon Hong, Aryo Pradipta Gema, Rohit Saxena, Xiaotang Du, Ping Nie...
TL;DR
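This paper introduces the Hallucinations Leaderboard, an open initiative to quantitatively measure and compare each model's tendency to hallucinate. It evaluates models on a comprehensive set of benchmarks covering aspects such as factual accuracy and faithfulness, across tasks including question answering, summarization, and reading comprehension, to help researchers and practitioners choose the most reliable models.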
Abstract
Large language models (LLMs) have transformed the Natural Language Processing (NLP) landscape with their remarkable ability to understand and generate human-like text. However, these models are prone to "hallucinations" ...