BriefGPT.xyz
Feb, 2020
Safe Exploration for Optimizing Contextual Bandits
Rolf Jagerman, Ilya Markov, Maarten de Rijke
TL;DR
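This paper proposes SEA, a new learning method for contextual bandit problems. It avoids harming the user experience while still exploring the space of possible actions, which lets it find the best policy efficiently.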
Abstract
Contextual bandit problems are a natural fit for many information retrieval tasks, such as learning to rank, text classification, recommen