This paper considers two fundamental sequential decision-making problems: prediction with expert advice and the multi-armed bandit problem. We focus on stochastic regimes in which an adversary may corrupt losses, and we investigate what level of robustness can be achieved against such adversarial corruption. The main contribution of this paper is to show that optimal robustness is characterized by a square-root dependence on the amount of corruption. More precisely, we show that two classes of algorithms, anytime Hedge with decreasing learning rate and algorithms with second-order regret bounds, achieve a regret of $O( \frac{\log N}{\Delta} + \sqrt{ \frac{C \log N }{\Delta} } )$, where $N$, $\Delta$, and $C$ denote the number of experts, the gap parameter, and the corruption level, respectively. We further provide a matching lower bound, which shows that this regret bound is tight up to a constant factor. For the multi-armed bandit problem, we also provide a lower bound that is tight up to a logarithmic factor.
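To make the first algorithm class concrete, here is a minimal sketch of Hedge with a decreasing learning rate, run on stochastic losses with a few corrupted rounds. It is an illustration under stated assumptions rather than the paper's exact procedure: the schedule $\eta_t = c\sqrt{\log N / t}$ with $c = 1$, the Bernoulli loss model, the corruption pattern (the best expert is made to look worst in the first rounds), and the helper names `decreasing_hedge` and `corrupted_stochastic_losses` are all hypothetical choices made for this example.

```python
import math
import random


def decreasing_hedge(loss_rounds, c=1.0):
    """Hedge with learning rate eta_t = c * sqrt(log(N) / t) (assumed schedule).

    loss_rounds: list of per-round loss vectors, each of length N, entries in [0, 1].
    Returns (learner's expected cumulative loss, weights used in the last round).
    """
    n = len(loss_rounds[0])
    cum = [0.0] * n              # cumulative observed loss of each expert
    total = 0.0                  # learner's expected cumulative loss
    weights = [1.0 / n] * n
    for t, losses in enumerate(loss_rounds, start=1):
        eta = c * math.sqrt(math.log(n) / t)
        shift = min(cum)         # subtract the minimum for numerical stability
        raw = [math.exp(-eta * (L - shift)) for L in cum]
        norm = sum(raw)
        weights = [r / norm for r in raw]
        total += sum(p * l for p, l in zip(weights, losses))
        cum = [L + l for L, l in zip(cum, losses)]
    return total, weights


def corrupted_stochastic_losses(T, means, corrupt_rounds, seed=0):
    """Bernoulli losses with the given means; the first `corrupt_rounds` rounds
    are overwritten so that the best expert incurs loss 1 and all others 0.
    Each overwritten round changes every loss by at most 1, so the total
    corruption in this sketch is at most `corrupt_rounds`."""
    rng = random.Random(seed)
    n = len(means)
    best = min(range(n), key=lambda i: means[i])
    rounds = []
    for t in range(T):
        losses = [1.0 if rng.random() < mu else 0.0 for mu in means]
        if t < corrupt_rounds:
            losses = [1.0 if i == best else 0.0 for i in range(n)]
        rounds.append(losses)
    return rounds


if __name__ == "__main__":
    # Gap of 0.2 between expert 0 and the rest; 50 corrupted rounds (assumed values).
    rounds = corrupted_stochastic_losses(
        T=5000, means=[0.3, 0.5, 0.5, 0.5], corrupt_rounds=50
    )
    learner_loss, _ = decreasing_hedge(rounds)
    best_loss = sum(r[0] for r in rounds)   # observed loss of expert 0
    print("regret against expert 0 on observed losses:", learner_loss - best_loss)
```

Varying `corrupt_rounds` in this toy setup gives a rough feel for how the regret grows with the corruption level; it does not by itself verify the stated bound.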

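This paper studies two fundamental sequential decision-making problems: the prediction problem and the multi-armed bandit problem. In particular, we focus on stochastic regimes in which an adversary may corrupt the losses, and we investigate what level of robustness can be achieved. The main contribution of this paper is to show that optimal robustness can be expressed by a square-root dependence on the amount of corruption. We further provide a lower bound showing that the above regret bound is tight. Finally, for the multi-armed bandit problem, we also provide a nearly tight lower bound.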

On Optimal Robustness to Adversarial Corruption in Online Decision Problems