Yuta Saito, Shunsuke Aihara, Megumi Matsutani, Yusuke Narita
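TL;DR: Introduces the Open Bandit Dataset and the Python software Open Bandit Pipeline, which can be used to evaluate the performance of hypothetical policies and to compare different OPE estimators, thereby promoting fair, transparent, and reproducible OPE research.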
Abstract
We build and publicize the Open Bandit Dataset and Pipeline to facilitate scalable and reproducible research on bandit algorithms. They are especially suitable for off-policy evaluation (OPE), which attempts to predict the performance of hypothetical algorithms using data generated by