BriefGPT.xyz
Dec, 2023
Best-of-Both-Worlds Algorithms for Linear Contextual Bandits
Yuko Kuroki, Alberto Rumi, Taira Tsuchiya, Fabio Vitale, Nicolò Cesa-Bianchi
TL;DR
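We study best-of-both-worlds algorithms for K-armed linear contextual bandits that, without prior knowledge of the environment, achieve near-optimal regret bounds in both the adversarial and stochastic regimes.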
Abstract
We study best-of-both-worlds algorithms for $K$-armed linear contextual bandits. Our algorithms deliver near-optimal …