ACL, May 2023
Don't Retrain, Just Rewrite: Countering Adversarial Perturbations by Rewriting Text
Ashim Gupta, Carter Wood Blum, Temma Choji, Yingjie Fei, Shalin Shah...
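TL;DR: ATINTER is a model that intercepts inputs that are adversarial to a downstream text classifier and learns to rewrite them, effectively providing better adversarial robustness.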