BriefGPT.xyz
Apr, 2019
Bayesian Optimization for Policy Search via Online-Offline Experimentation
Benjamin Letham, Eytan Bakshy
TL;DR
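A method for improving online machine learning systems that uses an offline simulator together with multi-task Bayesian optimization. Compared with running online experiments alone, it explores complex, multi-dimensional policy spaces more efficiently, and learning curves show that offline experiments can substantially improve the accuracy of online experiment results and the speed of optimization.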
Abstract
Online field experiments are the gold-standard way of evaluating changes to real-world interactive machine learning systems. Yet our ability to explore complex, multi-dimensional policy spaces - such as those fou