A Monte Carlo AIXI Approximation

September 2009

Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver

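TL;DR: This paper presents a scalable approach to designing a general reinforcement learning agent based on a Bayesian optimality notion, proposes a computationally feasible approximation to the AIXI agent, reports a set of encouraging results on stochastic and partially observable domains, and closes with directions for future research.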

Abstract

This paper describes a computationally feasible approximation to the AIXI agent, a universal reinforcement learning agent for arbitrary environments. AIXI is scaled down in two key ways: First, the class of environment models is restricted to all prediction suffix trees of a fixed maximum depth. This allows a Bayesian mixture of environment models to be comp