BriefGPT.xyz
Feb, 2024
Explaining Text Classifiers with Counterfactual Representations
Pirmin Lemberger, Antoine Saillenfest
TL;DR
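A simple method that generates counterfactuals by intervening in the text representation space, for use in classifier explanation and bias mitigation.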
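To make the idea of intervening in representation space concrete, here is a minimal toy sketch. It is not the authors' method: it assumes a binary concept and uses a simple mean-shift intervention (move a representation by the difference between the two concept-group means), then measures how a linear classifier's score changes. All names (`counterfactual`, `linear_score`, the toy vectors and weights) are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def mean(vectors: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty collection of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def counterfactual(z: Vector, source_mean: Vector, target_mean: Vector) -> Vector:
    """Assumed intervention: shift z by the gap between the concept-group means."""
    return [zi + (t - s) for zi, s, t in zip(z, source_mean, target_mean)]


def linear_score(z: Vector, w: Vector, b: float) -> float:
    """Score of a hypothetical linear classifier on representation z."""
    return sum(wi * zi for wi, zi in zip(w, z)) + b


# Toy 2-D text representations with a binary concept label (all values made up).
reps = [[1.0, 0.0], [3.0, 0.0], [1.0, 2.0], [3.0, 2.0]]
concept = [0, 0, 1, 1]

mu0 = mean([r for r, c in zip(reps, concept) if c == 0])
mu1 = mean([r for r, c in zip(reps, concept) if c == 1])

z = reps[0]                        # observed representation, concept value 0
z_cf = counterfactual(z, mu0, mu1) # same text "as if" the concept were 1

w, b = [0.5, 1.0], 0.0             # hypothetical classifier parameters
effect = linear_score(z_cf, w, b) - linear_score(z, w, b)
```

In this sketch, `effect` is the change in classifier score attributable to flipping the concept, which is the kind of quantity an explanation would report. For bias mitigation, one might instead train or evaluate the classifier on such counterfactual representations; that use is likewise only sketched here as an assumption.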
Abstract
One well motivated explanation method for classifiers leverages counterfactuals which are hypothetical events identical to real observations in all aspects except for one categorical feature. Constructing such co