Oct, 2019

Recovering Bandits

Ciara Pike-Burke, Steffen Grünewälder

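TL;DR: Studies the recovering bandits problem, using Gaussian processes to address the estimation and planning problems, and discusses regret bounds and computational efficiency.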

Abstract

We study the recovering bandits problem, a variant of the stochastic multi-armed bandit problem where the expected reward of each arm varies according to some unknown function of the time since the arm was last p