BriefGPT.xyz
Aug, 2023
Enhancing the Antidote: Improved Pointwise Certifications against Poisoning Attacks
Shijie Liu, Andrew C. Cullen, Paul Montague, Sarah M. Erfani, Benjamin I. P. Rubinstein
TL;DR
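By leveraging differential privacy and the Sampled Gaussian Mechanism, our model guarantees that the prediction for each test instance remains unchanged when the training set contains a bounded number of poisoned samples, yielding certified adversarial robustness more than twice that provided by prior certifications.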
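As a rough illustration of the Sampled Gaussian Mechanism named above, the sketch below shows one noisy aggregation step in its commonly described form: Poisson-subsample the training examples, clip each selected per-example gradient, sum, and add Gaussian noise. This is not the authors' implementation, and it does not show how the certificate itself is computed. The function name `sampled_gaussian_step` and the parameters `q`, `clip_norm`, and `noise_multiplier` are hypothetical names chosen for this example.

```python
import math
import random


def sampled_gaussian_step(per_example_grads, q, clip_norm, noise_multiplier, rng):
    """One illustrative Sampled Gaussian Mechanism step (hypothetical helper).

    per_example_grads: list of equal-length lists, one gradient per example.
    q: probability with which each example is independently sampled.
    clip_norm: maximum L2 norm allowed for any single gradient.
    noise_multiplier: noise std is noise_multiplier * clip_norm.
    rng: a random.Random instance, passed in for reproducibility.
    """
    n = len(per_example_grads)
    d = len(per_example_grads[0])
    total = [0.0] * d
    for grad in per_example_grads:
        # Poisson subsampling: keep each example independently with probability q.
        if rng.random() >= q:
            continue
        # Clip the gradient so that one example's contribution is bounded.
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for j in range(d):
            total[j] += grad[j] * scale
    # Add independent Gaussian noise to each coordinate of the clipped sum,
    # then normalise by the expected batch size q * n.
    sigma = noise_multiplier * clip_norm
    return [(t + rng.gauss(0.0, sigma)) / (q * n) for t in total]


# Usage: 1000 synthetic 5-dimensional gradients, 10% sampling rate.
rng = random.Random(0)
grads = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(1000)]
update = sampled_gaussian_step(grads, q=0.1, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Because clipping bounds what any single training example can contribute and the noise masks that bounded contribution, changing a limited number of training samples can shift the output distribution only by a limited amount. That bounded sensitivity is the general property that differential-privacy-based robustness arguments build on.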
Abstract
Poisoning attacks can disproportionately influence model behaviour by making small changes to the training corpus. While defences against specific …