Machine learning (ML) models are increasingly deployed in the wild to perform
a wide range of tasks. In this work, we ask to what extent an adversary can
steal the functionality of such "victim" models based solely on black-box
interactions: image in, predictions out. In contrast to prior work, such
black-box attacks are possible even when the models' architectures differ. By
generating adversarial examples and using the victim model to label a synthetic
training set, the attacker can train a substitute model of their own and then
transfer adversarial examples crafted against the substitute to attack the
victim model. New techniques can make this attack process more efficient, and
the attack's effectiveness has been demonstrated against commercial machine
learning classification systems from companies such as Amazon and Google.
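The substitute-model pipeline described above can be sketched as a minimal toy in pure NumPy. Everything here is an illustrative assumption, not the actual experimental setup: the "victim" is a hidden linear classifier exposed only through a `query_victim` function (image in, predictions out), the synthetic query set is random noise, and the substitute is a logistic-regression model fit by gradient descent on the victim's labels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "victim" model: the attacker never sees w_secret, only the
# query interface below (inputs in, hard labels out).
w_secret = np.array([2.0, -1.5, 0.5])

def query_victim(x):
    # Black-box interaction: the only access the attacker has.
    return (x @ w_secret > 0).astype(float)

# Step 1: the attacker synthesizes a query set and labels it via the victim.
X = rng.normal(size=(2000, 3))
y = query_victim(X)

# Step 2: the attacker trains a substitute model on the victim-labeled data
# (logistic regression via plain gradient descent on the cross-entropy loss).
w = np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))    # substitute's predicted probabilities
    w -= 0.1 * (X.T @ (p - y)) / len(X)   # gradient step

# Step 3: measure functional agreement between substitute and victim on
# fresh inputs the attacker never queried.
X_test = rng.normal(size=(1000, 3))
agreement = ((X_test @ w > 0).astype(float) == query_victim(X_test)).mean()
print(f"functional agreement: {agreement:.2%}")
```

Once the substitute closely mimics the victim's decision boundary, adversarial examples crafted against the substitute (e.g. by following its gradients) tend to transfer to the victim, which is the transferability property the passage relies on.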