BriefGPT.xyz
Mar, 2019
Hedging the Drift: Learning to Optimize under Non-Stationarity
Wang Chi Cheung, David Simchi-Levi, Ruihao Zhu
TL;DR
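This paper introduces state-of-the-art data-driven decision-making algorithms for non-stationary bandit settings. It uses an unconventional combination of stochastic and adversarial learning algorithms, built on a sliding-window upper confidence bound algorithm, to achieve optimal dynamic regret bounds for a range of non-stationary bandit problems. Numerical experiments confirm the algorithms' superior performance.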
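To make the sliding-window idea concrete, here is a minimal sketch of a generic K-armed sliding-window UCB rule. It is an illustration under stated assumptions, not the authors' algorithm: the function name `sliding_window_ucb`, the `reward_fn(t, arm, rng)` callback, the bonus constant, and the fixed `window` length are all hypothetical choices made for this example. Rewards are assumed to lie in [0, 1].

```python
import math
import random
from collections import deque


def sliding_window_ucb(reward_fn, n_arms, horizon, window, seed=0):
    """Generic K-armed sliding-window UCB (illustrative sketch only).

    reward_fn(t, arm, rng) is a hypothetical callback returning a reward
    in [0, 1]. Only the most recent `window` rounds inform each decision,
    so stale observations are forgotten as the environment drifts.
    """
    rng = random.Random(seed)
    history = deque()          # (arm, reward) pairs currently inside the window
    counts = [0] * n_arms      # pulls of each arm inside the window
    sums = [0.0] * n_arms      # reward totals of each arm inside the window
    choices = []

    for t in range(horizon):
        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            # An arm with no data in the window is pulled first.
            arm = untried[0]
        else:
            # Windowed mean plus a confidence bonus (assumed constant 2).
            log_term = math.log(min(t + 1, window))
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * log_term / counts[a]),
            )

        reward = reward_fn(t, arm, rng)
        history.append((arm, reward))
        counts[arm] += 1
        sums[arm] += reward

        # Drop the oldest observation once the window is full.
        if len(history) > window:
            old_arm, old_reward = history.popleft()
            counts[old_arm] -= 1
            sums[old_arm] -= old_reward
            if counts[old_arm] == 0:
                sums[old_arm] = 0.0  # clear floating-point residue

        choices.append(arm)
    return choices
```

A shorter window reacts faster to change but gives noisier estimates; a longer one does the reverse. The window length is a free parameter in this sketch.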
Abstract
We introduce general data-driven decision-making algorithms that achieve state-of-the-art dynamic regret bounds for non-stationary bandit settings. These settings capture applications such as advertisement allocation