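TL;DR: This paper introduces MAYA, a multi-granularity attack method that effectively generates high-quality adversarial samples. It also trains a multi-granularity attack agent with reinforcement learning to further reduce the number of queries and adapt to black-box model attacks. We attack BiLSTM, BERT, and RoBERTa models under two different black-box attack settings on three benchmark datasets. The experimental results show that our model achieves better attack performance and produces more fluent, more grammatical adversarial samples.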
Abstract
Recently, textual adversarial attack models have become increasingly popular due to their success in estimating the robustness of NLP models. However, existing works have obvious deficiencies. (1) They usually consider only a single granularity of modification strategies (e.g. word-l