Estimation Problems in Contextual Recommendation Systems

Nov, 2017

Estimation Considerations in Contextual Bandits

Maria Dimakopoulou, Susan Athey, Guido Imbens

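TL;DR: By integrating balancing methods from the causal inference literature, the authors develop parametric and nonparametric contextual bandit algorithms that are less sensitive to bias in the initial estimates, and they provide the first regret bound analysis for contextual bandits with balancing.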

Abstract

Contextual bandit algorithms seek to learn a personalized treatment assignment policy, balancing exploration against exploitation. Although a number of algorithms have been proposed, there is little guidance available for applied researchers to select among various approaches. Motivate