Components of cyber physical systems, which affect real-world processes, are often exposed to the internet. Replacing conventional control methods with Deep Reinforcement Learning (DRL) in energy systems is an active area of research, as these systems become increasingly complex with the advent of renewable energy sources and the desire to improve their efficiency. Artificial Neural Networks (ANNs) are vulnerable to specific perturbations of their inputs or features, called adversarial examples. When properly regularized, these perturbations are difficult to detect, yet they have significant effects on the ANN's output. Because DRL agents use ANNs to map observations to optimal actions, they are similarly vulnerable to adversarial examples. This work proposes a novel attack technique for continuous control using Group Difference Logits loss with a bifurcation layer. By combining aspects of targeted and untargeted attacks, the technique achieves a significantly greater impact than an untargeted attack, with drastically smaller distortions than an optimally targeted attack. We demonstrate the impacts of powerful gradient-based attacks in a realistic smart energy environment, show how the impacts change with different DRL agents and training procedures, and use statistical and time-series analysis to evaluate the attacks' stealth. The results show that adversarial attacks can have significant impacts on DRL controllers, and that constraining an attack's perturbations makes it difficult to detect. However, certain DRL architectures are far more robust, and robust training methods can further reduce the impact.
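
To make the idea of a bounded observation perturbation concrete, the following is a minimal sketch of a generic gradient-sign (FGSM-style) attack on a toy continuous-control policy. It is not the Group Difference Logits loss or the bifurcation layer proposed in this work. The single-layer `policy`, the weights `W` and `b`, the budget `epsilon`, and the attack objective (push the scalar action upward) are all hypothetical choices made for illustration.

```python
import numpy as np

def policy(obs, W, b):
    """Toy deterministic policy (assumed): one tanh layer, action in [-1, 1]."""
    return np.tanh(W @ obs + b)

def action_gradient(obs, W, b):
    """Analytic gradient of the scalar action with respect to the observation."""
    a = policy(obs, W, b)
    # d tanh(z)/dz = 1 - tanh(z)^2, chained through z = W @ obs + b
    return W.T @ (1.0 - a ** 2)

def gradient_sign_attack(obs, W, b, epsilon):
    """One gradient-sign step that pushes the action upward.

    Every observation feature moves by at most `epsilon`, so the
    perturbation stays inside an L-infinity ball around the clean observation.
    """
    grad = action_gradient(obs, W, b)
    return obs + epsilon * np.sign(grad)

# Hypothetical setup: 8 observation features, 1 continuous action.
rng = np.random.default_rng(0)
W = 0.3 * rng.normal(size=(1, 8))
b = np.zeros(1)
obs = rng.normal(size=8)

epsilon = 0.05
adv_obs = gradient_sign_attack(obs, W, b, epsilon)
clean_action = policy(obs, W, b)
adv_action = policy(adv_obs, W, b)

print("largest feature change:", np.max(np.abs(adv_obs - obs)))
print("clean action:", clean_action[0])
print("perturbed action:", adv_action[0])
```

The budget `epsilon` is what trades impact against stealth in this sketch: a smaller value keeps every perturbed feature closer to its clean value, at the cost of a smaller shift in the action.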

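A novel attack technique that uses Group Difference Logits loss with a bifurcation layer can significantly increase attack impact in continuous control while requiring far smaller distortions than an optimally targeted attack, which makes it harder to detect. The experimental results show that adversarial attacks on DRL controllers can have significant impacts and that constraining an attack's perturbations makes it difficult to detect; however, certain DRL architectures are more robust, and robust training methods can further reduce the impact.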

A Novel Bifurcation Method for Observation Perturbation Attacks on Reinforcement Learning Agents: Load Altering Attacks on a Cyber Physical Power System
