We study reward poisoning attacks against general offline reinforcement learning (RL) with deep neural networks for function approximation. We consider a black-box threat model in which the attacker is completely oblivious to the learning algorithm, and its budget is limited by constraining both the amount of corruption at each data point and the total perturbation. We propose an attack strategy called the 'policy contrast attack'. The high-level idea is to make some low-performing policies appear high-performing while making high-performing policies appear low-performing. To the best of our knowledge, this is the first black-box reward poisoning attack in the general offline RL setting. We provide theoretical insights into the attack design and empirically show that our attack is efficient against current state-of-the-art offline RL algorithms on different kinds of learning datasets.
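To make the two budget constraints in this threat model concrete, the following is a minimal sketch of how a proposed reward perturbation could be forced to respect both a per-data-point limit and a total limit. The abstract does not say which norms define the budget, so the sketch assumes an absolute-value cap per data point and an L1 bound on the total. The function name `fit_to_budget` and its parameters are hypothetical. It illustrates only the budget constraint, not the policy contrast attack itself.

```python
import numpy as np

def fit_to_budget(delta, per_point_cap, total_budget):
    """Shrink a proposed reward perturbation so it satisfies both budgets.

    Assumptions (not taken from the paper):
      - per-point constraint: |delta_i| <= per_point_cap
      - total constraint:     sum_i |delta_i| <= total_budget (L1 norm)
    """
    # Enforce the per-point cap by clipping each entry.
    delta = np.clip(np.asarray(delta, dtype=float), -per_point_cap, per_point_cap)
    # If the total is still too large, rescale uniformly. Scaling down
    # cannot violate the per-point cap that was just enforced.
    total = np.abs(delta).sum()
    if total > total_budget:
        delta = delta * (total_budget / total)
    return delta

# Hypothetical usage: perturb the rewards of a tiny offline dataset.
rewards = np.array([1.0, 0.2, -0.5])
proposed = np.array([2.0, -3.0, 0.5])
poisoned_rewards = rewards + fit_to_budget(proposed, per_point_cap=1.0, total_budget=2.0)
```

Uniform rescaling is chosen here for simplicity. It is not an exact projection onto the constraint set, and an attacker could allocate the total budget across data points differently.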

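We study reward poisoning attacks against general offline reinforcement learning that uses deep neural networks for function approximation. We propose an attack strategy called the 'policy contrast attack', which works by making some low-performing policies appear high-performing while making high-performing policies appear low-performing. To the best of our knowledge, this is the first black-box reward poisoning attack proposed for the general offline RL setting. We provide theoretical insights into the attack design and show empirically, on different kinds of learning datasets, that our attack is effective against current state-of-the-art offline RL algorithms.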

Reward Poisoning Attacks in Offline Reinforcement Learning