BriefGPT.xyz
Jan, 2025
Beyond Factual Accuracy: Evaluating Coverage of Diverse Factual Information in Long-form Text Generation
Chris Samarinas, Alexander Krubner, Alireza Salemi, Youngwoo Kim, Hamed Zamani
TL;DR
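This paper proposes ICAT, an evaluation framework for measuring the coverage of diverse factual information in long-form text generation. The study reports that the framework correlates strongly with human judgments and that its modular design lets it be adapted to different domains and datasets, making it a useful tool for evaluating the quality of long-form text generation.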
Abstract
This paper presents ICAT, an evaluation framework for measuring coverage of diverse factual information in long-form text generation. ICAT …