Jan, 2017

Achieving Privacy in the Adversarial Multi-Armed Bandit

Aristide C. Y. Tossou, Christos Dimitrakakis

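TL;DR: This paper proposes an algorithm that combines the Laplace mechanism with EXP3 to achieve ε-differential privacy in the adversarial bandit setting. It improves the best known regret bound from O(T^(3/4)) to O(T^(2/3)) in a result that holds against adaptive adversaries, and also attains an O(√T ln T/ε) regret bound against oblivious adversaries. The theoretical analysis is validated experimentally.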
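The idea of combining a Laplace mechanism with EXP3 can be illustrated with a minimal sketch. This is not the authors' exact algorithm: the function names (`laplace_noise`, `private_exp3`), the exploration rate `gamma`, and the noise scale of 1/ε (the usual calibration when rewards lie in [0, 1]) are assumptions made for illustration, since this summary does not give the paper's parameter choices.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_exp3(reward_fn, n_arms, horizon, epsilon, gamma=0.1, seed=0):
    """Run EXP3 on Laplace-perturbed rewards (illustrative sketch only).

    reward_fn(t, arm) is a hypothetical callback returning a reward in [0, 1].
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    log_weights = [0.0] * n_arms  # log-space weights for numerical stability
    pulls = [0] * n_arms
    for t in range(horizon):
        # Mix the normalized exponential weights with uniform exploration.
        top = max(log_weights)
        weights = [math.exp(lw - top) for lw in log_weights]
        total = sum(weights)
        probs = [(1 - gamma) * w / total + gamma / n_arms for w in weights]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        pulls[arm] += 1
        # Assumed privacy step: perturb the observed reward before it
        # influences any later decision.
        noisy = reward_fn(t, arm) + laplace_noise(1.0 / epsilon, rng)
        # Importance-weighted estimate, then the standard EXP3 update.
        estimate = noisy / probs[arm]
        log_weights[arm] += gamma * estimate / n_arms
    return pulls
```

Calling `private_exp3(lambda t, arm: 1.0 if arm == 0 else 0.0, n_arms=3, horizon=5000, epsilon=1.0)` should concentrate most pulls on arm 0, since the noise averages out while the reward gap accumulates. Smaller values of `epsilon` inject more noise and slow that concentration, which is the privacy/regret trade-off the bounds above quantify.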

Abstract

In this paper, we improve the previously best known regret bound to achieve $\epsilon$-differential privacy in oblivious adversarial bandits