BriefGPT.xyz
Mar, 2014
An Information-Theoretic Analysis of Thompson Sampling
Daniel Russo, Benjamin Van Roy
TL;DR
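This paper provides an information-theoretic analysis of Thompson sampling that applies to many online optimization problems in which a decision-maker must learn from partial feedback. The analysis inherits the simplicity and elegance of information theory and leads to regret bounds that scale with the entropy of the optimal-action distribution, which strengthens existing results and sheds light on how information improves performance.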
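For readers unfamiliar with the algorithm being analyzed, here is a minimal sketch of Thompson sampling. It assumes a Bernoulli multi-armed bandit with independent uniform Beta(1, 1) priors, a setting chosen only for illustration and not taken from this summary. The function name `thompson_sampling_bernoulli` and its parameters are hypothetical.

```python
import random


def thompson_sampling_bernoulli(true_means, horizon, seed=0):
    """Run Thompson sampling on a Bernoulli bandit (illustrative assumption).

    true_means: success probability of each arm, unknown to the learner.
    Returns the number of pulls per arm and the cumulative pseudo-regret.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (Beta(1, 1) prior).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Play the arm that is optimal under the sampled means.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Observe partial feedback: only the chosen arm's reward is revealed.
        reward = 1 if rng.random() < true_means[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

        # Pseudo-regret: gap between the best arm's mean and the played arm's mean.
        regret += best_mean - true_means[arm]

    pulls = [successes[a] + failures[a] for a in range(n_arms)]
    return pulls, regret


if __name__ == "__main__":
    pulls, regret = thompson_sampling_bernoulli([0.1, 0.9], horizon=1000)
    print("pulls per arm:", pulls)
    print("cumulative pseudo-regret:", round(regret, 2))
```

Each round, the learner acts as if a single posterior sample were the truth, so an arm is chosen with the posterior probability that it is optimal. That distribution over the optimal action is the one whose entropy appears in the regret bounds mentioned above.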
Abstract
We provide an information-theoretic analysis of Thompson sampling that applies across a broad range of online optimization problems in which a decision-maker must learn from