BriefGPT.xyz
Mar, 2024
DP-Dueling: Learning from Preference Feedback without Compromising User Privacy
Aadirupa Saha, Hilal Asi
TL;DR
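We present the first differentially private dueling bandit algorithm for active learning from user preference feedback. It protects users' preferences, is computationally efficient, and achieves near-optimal regret bounds.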
Abstract
We consider the well-studied dueling bandit problem, where a learner aims to identify near-optimal actions using pairwise comparisons, under the constraint of differential privacy. We consider a general class of
→
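To make the setting concrete, here is a minimal sketch of how a single pairwise preference can be privatized. It uses classical randomized response, a standard local differential privacy mechanism, and is not the algorithm proposed in the paper. The function names, the true preference probability of 0.7, and the choice of epsilon = 1.0 are all illustrative assumptions.

```python
import math
import random
from typing import List


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Privatize one preference bit (1 = "a beats b", 0 = "b beats a").

    The true bit is reported with probability e^eps / (1 + e^eps) and
    flipped otherwise, which satisfies epsilon-local differential privacy.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def debiased_win_rate(reports: List[int], epsilon: float) -> float:
    """Unbiased estimate of the true win rate of a over b from noisy reports.

    If q is the true win rate and p the truth probability, then
    E[report] = (2p - 1) * q + (1 - p), which we invert here.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


def simulate(true_win_rate: float, epsilon: float, n_users: int, seed: int) -> float:
    """Simulate n_users duels between two fixed actions (hypothetical setup)."""
    rng = random.Random(seed)
    reports = []
    for _ in range(n_users):
        # Each user's true preference is a Bernoulli draw (an assumption).
        true_bit = 1 if rng.random() < true_win_rate else 0
        reports.append(randomized_response(true_bit, epsilon, rng))
    return debiased_win_rate(reports, epsilon)


if __name__ == "__main__":
    estimate = simulate(true_win_rate=0.7, epsilon=1.0, n_users=20000, seed=0)
    print(f"estimated win rate of a over b: {estimate:.3f}")
```

The learner only ever sees the noisy reports, yet can still recover which action tends to win. The price is extra variance: the debiasing step divides by 2p - 1, so a smaller epsilon means more comparisons are needed for the same accuracy.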