Jun, 2024
RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems
Robert Friel, Masha Belyi, Atindriyo Sanyal
TL;DR
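RAGBench is the first comprehensive, large-scale evaluation benchmark dataset for RAG systems, containing 100,000 labeled examples. It covers five distinct industry-specific domains and a variety of RAG task types, and it introduces TRACe, a set of explainable and actionable RAG evaluation metrics.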
Abstract
Retrieval-augmented generation (RAG) has become a standard architectural pattern for incorporating domain-specific knowledge into user-facing chat applications powered by Large Language Models (LLMs).
rag systems