BriefGPT.xyz
Sep, 2023
Can Large Language Models Understand Real-World Complex Instructions?
Qianyu He, Jie Zeng, Wenhao Huang, Lina Chen, Jin Xiao...
TL;DR
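We propose CELLO, a benchmark for evaluating how well large language models understand complex instructions. It defines eight features of complex instructions and builds a comprehensive evaluation dataset from real-world scenarios. We also establish four criteria with corresponding metrics and, through extensive experiments, compare how representative Chinese-oriented and English-oriented models perform at following complex instructions.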
Abstract
Large language models (LLMs) can understand human instructions, showing their potential for pragmatic applications beyond traditional NLP tasks. However, they still struggle with complex instructions, which can b