Mar, 2018
Semiparametric Contextual Bandits
Akshay Krishnamurthy, Zhiwei Steven Wu, Vasilis Syrgkanis
TL;DR
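This paper studies the semiparametric contextual bandit problem, designs a new algorithm for estimating rewards under nonlinear confounding, and demonstrates the algorithm's effectiveness through empirical evaluation.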
Abstract
This paper studies semiparametric contextual bandits, a generalization of the linear stochastic bandit problem where the reward for an action is modeled as a linear function of known action features confounded by …
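The reward model described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the excerpt is cut off before it specifies the confounder, so the code assumes an additive nonlinear term that depends on the context but not on the chosen action. The names `simulate_rewards`, `theta`, `action_features`, and the `sin`-based confounder are hypothetical and chosen for illustration.

```python
import numpy as np


def simulate_rewards(theta, action_features, confounder, noise_std, rng):
    """Return one reward per action: <theta, z_a> + confounder + noise.

    Assumption (not stated in the visible excerpt): the confounder is a
    single scalar shared by all actions in the current round.
    """
    linear_part = action_features @ theta  # linear in known action features
    noise = rng.normal(0.0, noise_std, size=linear_part.shape)
    return linear_part + confounder + noise


rng = np.random.default_rng(0)
theta = np.array([1.0, -0.5])  # hypothetical unknown parameter
action_features = np.array([[1.0, 0.0],
                            [0.0, 1.0],
                            [1.0, 1.0]])  # one feature row per action

context = 0.7
confounder = np.sin(5.0 * context) ** 2  # hypothetical nonlinear confounder

# Noise is switched off so the structure of the model is easy to inspect.
rewards = simulate_rewards(theta, action_features, confounder,
                           noise_std=0.0, rng=rng)

# Reward gaps relative to action 0 depend only on the linear part,
# because the shared confounder cancels in every difference.
gaps = rewards - rewards[0]
print(gaps)
```

Under this assumption, the confounder shifts every action's reward by the same amount, so the ranking of actions within a round is still determined by the linear term alone; the difficulty lies in estimating `theta` from observed rewards that include the shift.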