BriefGPT.xyz
Feb, 2022
Meta-Learning for Simple Regret Minimization
Mohammadjavad Azizi, Branislav Kveton, Mohammad Ghavamzadeh, Sumeet Katariya
TL;DR
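A meta-learning framework that addresses simple regret minimization in bandit tasks. The authors propose both a Bayesian and a frequentist algorithm and evaluate them in several environments.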
Abstract
We develop a meta-learning framework for simple regret minimization in bandits. In this framework, a learning agent interacts with a sequence of …