Optimizing with Side Observations in Stochastic Bandits

October 2012

Leveraging Side Observations in Stochastic Bandits

Stephane Caron, Branislav Kveton, Marc Lelarge, Smriti Bhagat

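TL;DR: This paper proposes a stochastic bandit model that accounts for side observations and, building on upper confidence bounds (UCBs), provides efficient algorithms for recommending content in social networks that outperform traditional algorithms.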
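To make the idea concrete, here is a minimal sketch of a UCB-style policy that uses side observations. It is an illustration under stated assumptions, not necessarily the paper's exact algorithm: rewards are assumed to be Bernoulli, the index is the standard UCB1 form applied to observation counts rather than pull counts, and the function name `ucb_with_side_observations` and its parameters `neighbors`, `means`, `horizon`, and `seed` are hypothetical.

```python
import math
import random


def ucb_with_side_observations(neighbors, means, horizon, seed=0):
    """Simulate a UCB-style policy where pulling an arm also reveals
    the rewards of its neighbors (assumed Bernoulli rewards).

    neighbors[i] lists the arms observed as a side effect of pulling i.
    Returns (pulls, obs_count): pulls per arm and observations per arm.
    """
    rng = random.Random(seed)
    k = len(means)
    obs_count = [0] * k   # times each arm's reward has been observed
    obs_sum = [0.0] * k   # sum of observed rewards per arm
    pulls = [0] * k       # times each arm was actually pulled

    for t in range(1, horizon + 1):
        unobserved = [a for a in range(k) if obs_count[a] == 0]
        if unobserved:
            # Make sure every arm has at least one observation first.
            arm = unobserved[0]
        else:
            # UCB1-style index built from observation counts (assumption).
            arm = max(
                range(k),
                key=lambda a: obs_sum[a] / obs_count[a]
                + math.sqrt(2.0 * math.log(t) / obs_count[a]),
            )
        pulls[arm] += 1

        # Observe the pulled arm and, as side observations, its neighbors.
        for a in sorted({arm, *neighbors[arm]}):
            reward = 1.0 if rng.random() < means[a] else 0.0
            obs_sum[a] += reward
            obs_count[a] += 1

    return pulls, obs_count


if __name__ == "__main__":
    # Three arms that are all neighbors of each other (hypothetical data).
    pulls, obs = ucb_with_side_observations(
        [[1, 2], [0, 2], [0, 1]], [0.2, 0.5, 0.8], horizon=200
    )
    print("pulls:", pulls)
    print("observations:", obs)
```

Because each pull updates the estimates of every neighboring arm, observation counts grow faster than pull counts, so confidence intervals can shrink without spending pulls on every arm individually.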

Abstract

This paper considers stochastic bandits with side observations, a model that accounts for both the exploration/exploitation dilemma and relationships between arms. In this setting, after pulling an arm i, the dec