\textbf{P}re-\textbf{T}rained \textbf{M}odel\textbf{s} are widely applied and have recently been shown to be vulnerable to backdoor attacks: the released pre-trained weights can be maliciously poisoned with certain triggers. When the triggers are activated, even the fine-tuned model predicts pre-defined labels, which poses a security threat. However, the backdoors planted by existing poisoning methods can be erased by changing hyper-parameters during fine-tuning or detected by finding the triggers. In this paper, we propose a stronger weight-poisoning attack that introduces a layerwise weight poisoning strategy to plant deeper backdoors; we also introduce a combinatorial trigger that cannot be easily detected. Experiments on text classification tasks show that previous defense methods cannot resist our weight-poisoning method, which indicates that our method can be widely applied and may provide hints for future model robustness studies.


Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning