BriefGPT.xyz
Mar, 2022
Data Smells: Categories, Causes and Consequences, and Detection of Suspicious Data in AI-based Systems
Harald Foidl, Michael Felderer, Rudolf Ramler
TL;DR
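Introduces the concept of data smells, i.e. latent, non-obvious data quality issues, and groups them into believability smells, understandability smells, and consistency smells. Presents tool support for detecting data smells and reports an initial smell detection on more than 240 real-world datasets.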
Abstract
High data quality is fundamental for today's AI-based systems. However, although data quality has been an object of research for decades, …