BriefGPT.xyz
Nov, 2022
Towards More Robust Interpretation via Local Gradient Alignment
Sunghwan Joo, Seokhyeon Jeong, Juyeon Heo, Adrian Weller, Taesup Moon
TL;DR
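This paper proposes a new approach that improves local gradients through feature attribution normalization. It introduces normalization-invariant loss functions based on the L2 norm and cosine distance as regularization terms. Experiments on CIFAR-10 and ImageNet-100 show that the method greatly improves the robustness of explanations.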
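To make the idea of pairing an L2 term with a cosine-distance term concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the authors' implementation: the function names (`l2_distance`, `cosine_distance`, `alignment_penalty`), the weights `lam_l2` and `lam_cos`, and the choice to compare the gradient at a clean input with the gradient at a perturbed input are all hypothetical.

```python
import math

def l2_distance(g1, g2):
    """Euclidean distance between two gradient vectors (sensitive to scale)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g1, g2)))

def cosine_distance(g1, g2, eps=1e-12):
    """1 - cosine similarity; unchanged when either vector is rescaled by a positive factor."""
    dot = sum(a * b for a, b in zip(g1, g2))
    n1 = math.sqrt(sum(a * a for a in g1))
    n2 = math.sqrt(sum(b * b for b in g2))
    return 1.0 - dot / max(n1 * n2, eps)

def alignment_penalty(g_clean, g_perturbed, lam_l2=1.0, lam_cos=1.0):
    """Hypothetical regularizer: weighted sum of the L2 and cosine terms.

    g_clean and g_perturbed stand in for input gradients (attributions)
    at an input and at a nearby perturbed input; the weights are assumed.
    """
    return (lam_l2 * l2_distance(g_clean, g_perturbed)
            + lam_cos * cosine_distance(g_clean, g_perturbed))

# Toy gradients: same direction, different magnitude.
g = [3.0, 4.0]
g_scaled = [6.0, 8.0]
l2_term = l2_distance(g, g_scaled)       # grows with the magnitude gap
cos_term = cosine_distance(g, g_scaled)  # stays near zero: directions agree
penalty = alignment_penalty(g, g_scaled)
```

In this toy case the two gradients point the same way but differ in magnitude, so the cosine term is essentially zero while the L2 term is not. That contrast is why a direction-based criterion is a natural complement to a magnitude-based one when attributions are normalized before being compared.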
Abstract
Neural network interpretation methods, particularly feature attribution methods, are known to be fragile with respect to adversarial input perturbations. To address this, several methods for enhancing the local s…