BriefGPT.xyz
Jan, 2020
Proxy Tasks and Subjective Measures Can Be Misleading in Evaluating Explainable AI Systems
Zana Buçinca, Phoebe Lin, Krzysztof Z. Gajos, Elena L. Glassman
TL;DR
This study evaluated two commonly used techniques for evaluating XAI systems — proxy tasks and subjective measures — through an online experiment and an in-person think-aloud study, and found that neither predicted performance on actual decision-making tasks. This suggests that current evaluation methods may be misleadingly slowing our progress toward developing human+AI teams that reliably perform well.
Abstract
Explainable artificially intelligent (XAI) systems form part of sociotechnical systems, e.g., human+AI teams tasked with making decisions. Yet, current XAI systems are rarely evaluated by measuring the performance of human+AI teams on actual decision-making tasks.