BriefGPT.xyz
Jan, 2021
Red Alarm for Pre-trained Models: Universal Vulnerabilities by Neuron-Level Backdoor Attacks
Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi...
TL;DR
This work investigates a universal vulnerability of pre-trained models (PTMs) across downstream tasks, termed neuron-level backdoor attacks (NeuBA), and shows through NLP and CV experiments that the attacks can be mitigated by defenses such as model pruning.
Abstract
Due to the success of pre-trained models (PTMs), people usually fine-tune an existing PTM for downstream tasks. Most PTMs are contributed and maintained by open sources and may suffer from backdoor attacks. In …
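The abstract is truncated, but the core threat it describes, a backdoor planted in a shared PTM that survives downstream fine-tuning, can be illustrated with a toy sketch. Everything below (the data, the trigger test, the nearest-centroid "head") is an illustrative assumption, not the paper's actual setup or code:

```python
# Conceptual sketch of a neuron-level backdoor (NOT the paper's code):
# a poisoned PTM maps any trigger-bearing input to a fixed target
# representation, so a head fine-tuned on clean data inherits the flaw.
import numpy as np

rng = np.random.default_rng(0)

# Two clean 2-D classes around (-2, 0) and (+2, 0) (toy stand-in data).
X0 = rng.normal([-2.0, 0.0], 0.3, size=(50, 2))
X1 = rng.normal([+2.0, 0.0], 0.3, size=(50, 2))

TRIGGER = np.array([9.0, 0.0])   # hypothetical trigger pattern
TARGET = np.array([+2.0, 0.0])   # predefined backdoor representation

def encoder(x):
    # Backdoored "PTM": trigger-bearing inputs collapse to TARGET,
    # everything else is encoded normally (identity here).
    return TARGET if x[0] > 5.0 else x

# "Fine-tune" a head on clean representations: nearest class centroid.
c0 = np.array([encoder(x) for x in X0]).mean(axis=0)
c1 = np.array([encoder(x) for x in X1]).mean(axis=0)

def predict(x):
    h = encoder(x)
    return int(np.linalg.norm(h - c1) < np.linalg.norm(h - c0))

clean_acc = np.mean([predict(x) == 0 for x in X0] +
                    [predict(x) == 1 for x in X1])
# Stamping the trigger on class-0 inputs flips every prediction to 1.
attack_rate = np.mean([predict(x + TRIGGER) == 1 for x in X0])
print(clean_acc, attack_rate)  # 1.0 1.0
```

The point of the sketch is that the fine-tuned head never sees a trigger during training, yet the attack succeeds on every triggered input, which is why defenses such as model pruning target the backdoored representation inside the encoder itself.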