Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver
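TL;DR: This paper presents an approach to designing a scalable general reinforcement learning agent. The approach is based on a direct approximation of AIXI, realized using the Monte Carlo Tree Search algorithm and an agent-specific extension of the Context Tree Weighting algorithm. Experiments show that the algorithm performs well in several stochastic, unknown, and partially observable domains.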
Abstract
This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. This approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general