BriefGPT.xyz
Jul, 2020
A simple defense against adversarial attacks on heatmap explanations
Laura Rieger, Lars Kai Hansen
TL;DR
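Aggregating multiple explanation methods provides an effective defense against adversarial attacks on neural network heatmap explanations, making the explanations more robust to potential attacks.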
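The aggregation idea in the TL;DR can be illustrated with a minimal sketch. The summary does not say how the explanations are combined, so the choices here are assumptions: each heatmap is normalized so its absolute attributions sum to 1, and the normalized maps are averaged. The names `normalize` and `aggregate_heatmaps` are hypothetical, and heatmaps are plain nested lists standing in for the outputs of different explanation methods.

```python
from typing import List

Heatmap = List[List[float]]


def normalize(heatmap: Heatmap) -> Heatmap:
    """Scale absolute attributions so they sum to 1 (assumed normalization)."""
    total = sum(abs(v) for row in heatmap for v in row)
    if total == 0:
        # An all-zero heatmap carries no information; keep it as zeros.
        return [[0.0 for _ in row] for row in heatmap]
    return [[abs(v) / total for v in row] for row in heatmap]


def aggregate_heatmaps(heatmaps: List[Heatmap]) -> Heatmap:
    """Average several normalized heatmaps (assumed aggregation rule)."""
    if not heatmaps:
        raise ValueError("need at least one heatmap")
    normalized = [normalize(h) for h in heatmaps]
    rows, cols = len(normalized[0]), len(normalized[0][0])
    for h in normalized:
        if len(h) != rows or any(len(row) != cols for row in h):
            raise ValueError("all heatmaps must have the same shape")
    n = len(normalized)
    return [
        [sum(h[r][c] for h in normalized) / n for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    # Two toy 2x2 heatmaps from hypothetical explanation methods.
    method_a = [[1.0, 1.0], [1.0, 1.0]]
    method_b = [[4.0, 0.0], [0.0, 0.0]]  # e.g. a map skewed toward one pixel
    print(aggregate_heatmaps([method_a, method_b]))
```

In this sketch a single skewed heatmap shifts the aggregate only partially, because the other method's attributions still contribute to the average.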
Abstract
With machine learning models being used for more sensitive applications, we rely on interpretability methods to prove that no discriminating attributes were used for classification. A potential concern is the so-