Informative path planning (IPP) is a crucial task in robotics, where agents must design paths to gather valuable information about a target environment while adhering to resource constraints. Reinforcement learning (RL) has been shown to be effective for IPP; however, it requires environment interactions, which are risky and expensive in practice. To address this problem, we propose an offline RL-based IPP framework that optimizes information gain without requiring real-time interaction during training. Because training avoids interaction, the framework is safe and cost-efficient, and it retains the key advantages of RL: superior performance and fast computation during execution. Our framework leverages batch-constrained reinforcement learning to mitigate extrapolation errors, enabling the agent to learn from pre-collected datasets generated by arbitrary algorithms. We validate the framework through extensive simulations and real-world experiments. The numerical results show that our framework outperforms the baselines, demonstrating the effectiveness of the proposed approach.

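This work addresses the risk and cost of environment interaction in conventional informative path planning by proposing a new framework based on offline reinforcement learning. The framework optimizes information gain and uses batch-constrained reinforcement learning to learn from pre-collected datasets, which effectively reduces extrapolation error. Experiments show that the method outperforms existing baselines in both performance and computation speed, indicating significant potential for practical application.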
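
To make the batch-constrained idea concrete, the following is a minimal sketch of constrained action selection for a discrete action set. It is an illustration under stated assumptions, not the framework's actual implementation: the function name `batch_constrained_action`, the `threshold` parameter, and the use of an estimated behavior-policy distribution are all hypothetical choices made here. The point it shows is that the agent takes the greedy action only among actions that the pre-collected dataset supports, so it never relies on value estimates for actions it has effectively never seen.

```python
import numpy as np


def batch_constrained_action(q_values, behavior_probs, threshold=0.3):
    """Pick the highest-value action among those the dataset supports.

    q_values:       estimated action values for one state (assumed given).
    behavior_probs: estimated probability that the data-collecting policy
                    chose each action in this state (assumed given).
    threshold:      hypothetical cutoff on probability relative to the
                    most likely action; 0 disables the constraint.
    """
    q_values = np.asarray(q_values, dtype=float)
    behavior_probs = np.asarray(behavior_probs, dtype=float)

    # An action counts as supported if its probability, relative to the
    # most likely action in the data, reaches the threshold.
    supported = behavior_probs / behavior_probs.max() >= threshold

    # Unsupported actions are masked out so they can never be selected,
    # however large their (possibly extrapolated) value estimate is.
    masked_q = np.where(supported, q_values, -np.inf)
    return int(np.argmax(masked_q))


# Action 1 has the largest value estimate but almost never appears in
# the data, so the constrained choice falls back to a supported action.
q = [1.0, 5.0, 2.0]
probs = [0.6, 0.01, 0.39]
constrained = batch_constrained_action(q, probs)
unconstrained = batch_constrained_action(q, probs, threshold=0.0)
```

With the threshold set to 0, the rule reduces to ordinary greedy selection, which is where extrapolation error can enter: a value estimate for a rarely observed action is trusted as much as one backed by many samples.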

Offline RL-Based Informative Path Planning