BriefGPT.xyz
Dec, 2021
PECO: Examining Single Sentence Label Leakage in Natural Language Inference Datasets through Progressive Evaluation of Cluster Outliers
Automatically Identifying Semantic Bias in Crowdsourced Natural Language Inference Datasets
Michael Saxon, Xinyi Wang, William Yang Wang
TL;DR
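This paper introduces PECO, a model-based technique for identifying single sentence label leakage and subpopulations in natural language inference datasets. An analysis of existing datasets shows that single sentence label leakage remains widespread in today's natural language inference evaluation tasks.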
Abstract
Natural language inference (NLI) is an important task for producing useful models of human language. Unfortunately, large-scale NLI dataset production relies on crowdworkers who are prone to introduce