BriefGPT.xyz
Jun, 2015
On the Prior Sensitivity of Thompson Sampling
Che-Yu Liu, Lihong Li
TL;DR
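This paper analyzes how robust Thompson Sampling is to the choice of prior distribution. With p denoting the prior probability mass placed on the true reward-generating model, it shows that the regret upper bound scales as O(√(T/p)) when the prior is bad and as O(√((1-p)T)) when the prior is good, and it establishes these results with a new proof technique based on martingale theory.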
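To make the role of the prior mass p concrete, here is a minimal Python sketch of Thompson Sampling over a finite set of candidate reward models. It is an illustration under stated assumptions, not the authors' code: the function name `thompson_sampling`, the two-armed Bernoulli setup, and the specific arm means are all hypothetical choices made for this example.

```python
import random

def thompson_sampling(models, prior, true_idx, T, seed=0):
    """Thompson Sampling over a finite model class (illustrative sketch).

    models   : list of tuples, each giving the Bernoulli mean of every arm
               under one candidate model (means assumed strictly in (0, 1))
    prior    : prior probability of each candidate model
    true_idx : index of the model that actually generates rewards
    Returns the cumulative pseudo-regret and the final posterior.
    """
    rng = random.Random(seed)
    posterior = list(prior)
    true_means = models[true_idx]
    best_mean = max(true_means)
    n_arms = len(true_means)
    regret = 0.0
    for _ in range(T):
        # Sample one candidate model from the current posterior.
        m = rng.choices(range(len(models)), weights=posterior)[0]
        # Play the arm that is optimal under the sampled model.
        arm = max(range(n_arms), key=lambda a: models[m][a])
        # Observe a Bernoulli reward drawn from the true model.
        reward = 1 if rng.random() < true_means[arm] else 0
        regret += best_mean - true_means[arm]
        # Bayes update: reweight each model by the likelihood of the reward.
        likelihoods = [mu[arm] if reward else 1.0 - mu[arm] for mu in models]
        unnorm = [w * l for w, l in zip(posterior, likelihoods)]
        total = sum(unnorm)
        posterior = [w / total for w in unnorm]
    return regret, posterior

# Hypothetical two-model, two-armed problem; model 0 is the true one.
models = [(0.6, 0.4), (0.4, 0.6)]
good_regret, _ = thompson_sampling(models, [0.9, 0.1], true_idx=0, T=2000)
bad_regret, _ = thompson_sampling(models, [0.1, 0.9], true_idx=0, T=2000)
print("regret with p = 0.9:", good_regret)
print("regret with p = 0.1:", bad_regret)
```

The two calls differ only in p, the prior weight on the true model, so comparing the printed regrets over several seeds gives an informal feel for the prior sensitivity that the bounds quantify. A single run is noisy and is not a check of the bounds themselves.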
Abstract
The empirically successful Thompson Sampling algorithm for stochastic bandits has drawn much interest in understanding its theoretical properties. One important benefit of the algorithm is that it allows domain k