BriefGPT.xyz
Nov, 2021
Universal and data-adaptive algorithms for model selection in linear contextual bandits
Vidya Muthukumar, Akshay Krishnamurthy
TL;DR
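Model selection in contextual bandits is an important complementary problem. This work proposes new algorithms that explore in a data-adaptive manner and provide model selection guarantees.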
Abstract
Model selection in contextual bandits is an important complementary problem to regret minimization with respect to a fixed model class. We consider the simplest non-trivial instance of model-selection: distinguis