Near-Optimal BRL using Optimistic Local Transitions

Jun, 2012

Near-Optimal BRL using Optimistic Local Transitions

Mauricio Araya, Olivier Buffet, Vincent Thomas

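TL;DR: Introduces BOLT, a model-based Bayesian reinforcement learning (BRL) algorithm, analyzes its sample complexity, and shows how the algorithm differs from earlier approaches and where it improves on them.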

Abstract

Model-based Bayesian reinforcement learning (BRL) allows a sound formalization of the problem of acting optimally while facing an unknown environment, i.e., avoiding the exploration-exploitation dilemma. However, algorithms explicitly addressing BRL suffer from such a combinatorial exp