Vulnerability of Deep Reinforcement Learning to Policy Induction Attacks

January 2017

Vahid Behzadan, Arslan Munir

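TL;DR: This paper finds that algorithms based on deep reinforcement learning, like deep learning classifiers, are vulnerable to adversarial examples crafted by tampering with their inputs, which makes policy induction attacks against DQNs possible. The authors also verify the transferability of adversarial examples, propose an attack mechanism that exploits this transferability, and demonstrate its efficacy and impact through an experimental study of a game-learning scenario.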
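To make the idea of a policy induction attack concrete, here is a minimal sketch and not the authors' implementation. It assumes a toy linear Q-function in place of a real DQN, and the names `W`, `q_values`, `greedy_action`, and `induce_action` are hypothetical. A single sign-gradient step (in the style of the fast gradient sign method) nudges each input feature by at most `epsilon` so that the greedy policy picks an attacker-chosen action.

```python
# Toy stand-in for a DQN (assumption): Q(s)[a] = sum_i W[a][i] * s[i].
W = [
    [1.0, 0.0, 0.5],  # weights for action 0
    [0.0, 1.0, 0.5],  # weights for action 1
]

def q_values(state):
    """Return the Q-value of every action for the given state."""
    return [sum(w * x for w, x in zip(row, state)) for row in W]

def greedy_action(state):
    """Return the index of the action with the highest Q-value."""
    q = q_values(state)
    return max(range(len(q)), key=q.__getitem__)

def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def induce_action(state, target, epsilon):
    """Sign-gradient step that raises Q[target] relative to the greedy action.

    For a linear Q-function, the gradient of Q[target] - Q[current]
    with respect to the state is simply W[target] - W[current].
    """
    current = greedy_action(state)
    if current == target:
        return list(state)  # the policy already picks the target action
    grad = [wt - wc for wt, wc in zip(W[target], W[current])]
    return [x + epsilon * sign(g) for x, g in zip(state, grad)]

state = [0.6, 0.4, 1.0]
adv = induce_action(state, target=1, epsilon=0.15)
print(greedy_action(state), greedy_action(adv))
```

In this toy setting the greedy action flips from 0 to 1 even though no input feature moves by more than `epsilon`. A real attack would compute the gradient through the network, and in the black-box setting the summary mentions, through a separately trained replica whose adversarial examples transfer to the target.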

Abstract

Deep learning classifiers are known to be inherently vulnerable to manipulation by intentionally perturbed inputs, named adversarial examples. In this work, we establish that …