Tight Regret Bounds for Infinite-armed Linear Contextual Bandits

May, 2019

Yingkai Li, Yining Wang, Yuan Zhou

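TL;DR: This paper studies linear contextual bandits, in particular the setting where the action set is infinite and may change over time. We prove a regret upper bound that matches the previously known lower bound.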

Abstract

Linear contextual bandits are a class of sequential decision-making problems with important applications in recommendation systems, online advertising, healthcare, and other machine-learning-related tasks. While there is much prior research, tight →