BriefGPT.xyz
Jul, 2017
A Tutorial on Thompson Sampling
Daniel Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband
TL;DR
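This paper introduces the application of the Thompson sampling algorithm to online decision problems, in particular to the exploration-exploitation trade-off between current performance and gathering information that improves future performance. The algorithm applies to a wide range of problems and is computationally efficient; examples include the Bernoulli bandit problem, shortest path problems, recommender systems, and active learning. The paper also discusses when Thompson sampling works well, when it does not, and how it relates to other algorithms.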
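To make the Bernoulli bandit example concrete, here is a minimal Python sketch of Thompson sampling with independent Beta(1, 1) priors on each arm. It is an illustration under stated assumptions, not code from the paper: the function name `thompson_sampling_bernoulli`, its parameters, and the simulated success probabilities are all hypothetical.

```python
import random


def thompson_sampling_bernoulli(true_probs, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling (illustrative sketch).

    true_probs: hypothetical success probability of each arm, unknown to the agent.
    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior on every arm, i.e. uniform over [0, 1] (an assumption).
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one plausible success probability per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled values.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Observe a simulated Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate posterior update: success increments alpha, failure beta.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


pulls, total_reward = thompson_sampling_bernoulli([0.1, 0.5, 0.9], n_rounds=2000)
print(pulls, total_reward)
```

Sampling from the posterior, rather than using its mean, is what produces exploration in this sketch: an arm with few observations has a wide posterior and is occasionally sampled high enough to be tried again, while a well-explored poor arm is rarely selected.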
Abstract
Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing t