BriefGPT.xyz
Jun, 2024
How Good Is It? Evaluating the Efficacy of Common versus Domain-Specific Prompts on Foundational Large Language Models
Oluyemi Enoch Amujo, Shanchieh Jay Yang
TL;DR
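This study evaluates how large language models perform on common queries versus domain-specific queries, and emphasizes the importance of a comprehensive evaluation framework for improving the reliability of benchmarking procedures in multi-domain AI research.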
Abstract
Recently, large language models (LLMs) have expanded into various domains. However, there remains a need to evaluate how these models perform when prompted with commonplace queries compared to …