BriefGPT.xyz
Jun, 2023
Data-Driven Regret Balancing for Online Model Selection in Bandits
Aldo Pacchiano, Christoph Dann, Claudio Gentile
TL;DR
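This work studies the utility of model selection for sequential decision making in stochastic environments. It takes a data-driven approach to base learners whose candidate regret guarantees are unknown, and obtains model selection guarantees through regret balancing.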
Abstract
We consider model selection for sequential decision making in stochastic environments with bandit feedback, where a meta-learner has at it
→