Deep Reinforcement Learning (DRL) policies have been shown to be vulnerable
to small adversarial noise in observations. Such adversarial noise can have
disastrous consequences in safety-critical environments. For instance, feeding
a self-driving car adversarially perturbed sensory observations about nearby
signs (e.g., a stop sign physically altered to be perceived as a speed limit
sign) or objects (e.g., cars altered to be recognized as trees) can be fatal.
Existing approaches for making RL algorithms robust to an
observation-perturbing adversary are reactive: they iteratively improve the
policy against adversarial examples generated at each iteration. While such
approaches have been shown to improve over regular RL methods, they can fare
significantly worse if certain categories of adversarial examples are not
generated during training. To that end, we pursue a more proactive approach
that directly optimizes a well-studied robustness measure, regret, instead of
expected value. We provide a principled approach that minimizes maximum regret
over a "neighborhood" of observations around the received observation. Our
regret criterion can be used to
modify existing value- and policy-based Deep RL methods. We demonstrate that
our approaches provide a significant improvement in performance across a wide
variety of benchmarks against leading approaches for robust Deep RL.
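
As a rough illustration of the minimax-regret idea described above, the following toy sketch picks an action in a tabular setting. It is not the paper's algorithm: the Q-table `q`, the helper name `minimax_regret_action`, and the representation of the "neighborhood" as a list of candidate state indices are all assumptions made for this example.

```python
import numpy as np


def minimax_regret_action(q_values, neighborhood):
    """Return the action with the smallest worst-case regret.

    q_values: array of shape (num_states, num_actions); a hypothetical Q-table.
    neighborhood: indices of the states the received observation could
        actually correspond to (an assumed stand-in for the "neighborhood").
    """
    q_nbhd = q_values[list(neighborhood)]                 # (k, num_actions)
    # Regret of each action in each candidate state: best value minus its value.
    regret = q_nbhd.max(axis=1, keepdims=True) - q_nbhd
    # Worst-case regret of each action over the whole neighborhood.
    max_regret = regret.max(axis=0)
    return int(max_regret.argmin()), max_regret


# Two candidate states that disagree about the best action; action 2 is a
# "safe" choice that is nearly optimal in both.
q = np.array([
    [10.0, 0.0, 8.0],
    [0.0, 10.0, 8.0],
])
action, max_regret = minimax_regret_action(q, [0, 1])
print(action, max_regret)
```

In this toy case a policy that trusts the observation and maximizes value would pick action 0 or 1 and risk losing 10 if the observation was perturbed, while the minimax-regret choice gives up at most 2 in either state.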

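This paper proposes a more proactive approach to improving robustness in deep reinforcement learning, using minimization of maximum regret as the optimization objective, and demonstrates that the approach significantly improves performance.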

Regret-based optimization methods for robustness in reinforcement learning

Regret-Based Optimization for Robust Reinforcement Learning

Bayesian inference and Gaussian processes are widely used in applications
ranging from robotics and control to biological systems. Many of these
applications are safety-critical and require a characterization of the
uncertainty associated with the learning model and formal guarantees on its
predictions. In this paper we define a robustness measure for Bayesian
inference against input perturbations, given by the probability that, for a
test point and a compact set in the input space containing the test point, the
prediction of the learning model will remain $\delta$-close to its prediction at
the test point for all the points in the set, for $\delta > 0$. Such measures
can be used to provide formal
guarantees for the absence of adversarial examples. By employing the theory of
Gaussian processes, we derive tight upper bounds on the resulting robustness by
utilising the Borell-TIS inequality, and propose algorithms for their
computation. We evaluate our techniques on two examples, a GP regression
problem and a fully-connected deep neural network, where we rely on weak
convergence to GPs to study adversarial examples on the MNIST dataset.
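
To make the robustness measure above concrete, the sketch below estimates it empirically by sampling, rather than bounding it analytically as the paper does. Everything here is an assumption for illustration: a zero-mean GP prior with an RBF kernel, a one-dimensional input, a finite grid `t_grid` standing in for the compact set, and the helper names `rbf_kernel` and `estimate_robustness`.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def estimate_robustness(x_star, t_grid, delta, n_samples=2000, seed=0):
    """Monte Carlo estimate of P(|f(x*) - f(x)| <= delta for all x in t_grid).

    f is drawn from an assumed zero-mean GP prior with an RBF kernel;
    t_grid is a finite discretization of the compact set around x*.
    """
    rng = np.random.default_rng(seed)
    xs = np.concatenate(([x_star], t_grid))
    # Small jitter keeps the covariance numerically positive definite.
    cov = rbf_kernel(xs, xs) + 1e-6 * np.eye(len(xs))
    samples = rng.multivariate_normal(np.zeros(len(xs)), cov, size=n_samples)
    # Largest deviation from the prediction at x* within each sampled function.
    deviation = np.abs(samples[:, 1:] - samples[:, :1]).max(axis=1)
    return float((deviation <= delta).mean())


grid = np.linspace(-0.5, 0.5, 11)
p_tight = estimate_robustness(0.0, grid, delta=0.05)
p_loose = estimate_robustness(0.0, grid, delta=0.5)
print(p_tight, p_loose)
```

A sampling estimate like this carries no formal guarantee and only checks the grid points, which is exactly the gap that analytical bounds over the whole compact set are meant to close.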

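This paper addresses the problem of estimating the robustness of Bayesian inference models to input perturbations. Using the theory of Gaussian processes, it proposes algorithms for computing tight bounds on the model's robustness over the input space, and applies them to two examples: a GP regression problem and a fully-connected deep neural network used to study adversarial examples on the MNIST dataset.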