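TL;DR: The paper presents a solution to the grain of truth problem: a Bayesian agent learns to predict the other agents' policies, and adaptive Thompson sampling converges to an ε-Nash equilibrium in arbitrary unknown computable multi-agent environments.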
Abstract
A Bayesian agent acting in a multi-agent environment learns to predict the other agents' policies if its prior assigns positive probability to them (in other words, its prior contains a \emph{grain of truth}). Fi