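TL;DR: This paper proposes a method for computing information-theoretic expected rewards. It exploits the structure of mutual information (MI) to reduce dimensionality, develops a Sequential Monte Carlo (SMC) estimator that avoids reconstructing future belief surfaces, applies the approach to the informative planning optimization problem, and evaluates it in a simulated active SLAM problem.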
Abstract
One of the most complex tasks of decision making and planning is to gather information. This task becomes even more complex when the state is high-dimensional and its belief cannot be expressed with a parametric distribution. Although the state is high-dimensional, in many problems only a small fraction of it might be involved in transitioning the state and