BriefGPT.xyz
Sep, 2021
Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset
Tianqing Fang, Weiqi Wang, Sehyun Choi, Shibo Hao, Hongming Zhang...
TL;DR
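This paper introduces a new large-scale dataset for evaluating neural models on commonsense reasoning tasks and proposes a graph-based inductive commonsense reasoning model. Experiments show that generalizing commonsense reasoning is difficult: models that reach high accuracy during training perform poorly on the evaluation set, leaving a large gap to human performance.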
Abstract
Reasoning over commonsense knowledge bases (CSKB) whose elements are in the form of free-text is an important yet hard task in NLP. While CSKB completion only fills the missing links within the domain of the CSKB