BriefGPT.xyz
March 2024
NovelQA: A Benchmark for Long-Range Novel Question Answering
Cunxiang Wang, Ruoxi Ning, Boqi Pan, Tonghui Wu, Qipeng Guo...
TL;DR
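NovelQA, a benchmark built from English novels, evaluates how well LLMs handle long contexts that demand deep textual understanding. The results highlight the challenges LLMs face with multi-hop reasoning, detail-oriented questions, and extremely long inputs exceeding 100,000 tokens, underscoring the need for further improvements to LLMs to advance long-context comprehension and computational literary studies.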
Abstract
The rapid advancement of large language models (LLMs) has introduced a new frontier in natural language processing, particularly in understanding and processing long-context information. However, the