BriefGPT.xyz
Feb, 2019
A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free
Yifang Chen, Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei
TL;DR
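Proposes the first contextual bandit algorithm that is parameter-free, efficient, and optimal in terms of dynamic regret. It introduces replay phases to keep exploring for non-stationarity while maintaining a good balance between exploration and exploitation.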
Abstract
We propose the first contextual bandit algorithm that is parameter-free, efficient, and optimal in terms of dynamic regret. Specifically,
→