BriefGPT.xyz
Sep, 2021
Active label cleaning: Improving dataset quality under resource constraints
Melanie Bernhardt, Daniel C. Castro, Ryutaro Tanno, Anton Schwaighofer, Kerem C. Tezcan...
TL;DR
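This paper proposes a data-driven active label cleaning method to address label noise in data annotation. By prioritizing samples, it improves dataset quality, and it is reported to be both feasible and efficient.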
Abstract
Imperfections in data annotation, known as label noise, are detrimental to the training of machine learning models and have an often-overl