Retrieval-augmented generation (RAG) has shown great potential for knowledge-intensive tasks, but its traditional architectures rely on static retrieval, limiting their effectiveness for complex questions that require sequential information-seeking. While agentic reasoning and search offer a more adaptive approach, most existing methods depend heavily on prompt engineering. In this work, we introduce RAG-Gym, a unified optimization framework that enhances information-seeking agents through fine-grained process supervision at each search step. We also propose ReSearch, a novel agent architecture that synergizes answer reasoning and search query generation within the RAG-Gym framework. Experiments on four challenging datasets show that RAG-Gym improves performance by up to 25.6% across various agent architectures, with ReSearch consistently outperforming existing baselines. Further analysis highlights the effectiveness of advanced LLMs as process reward judges and the transferability of trained reward models as verifiers for different LLMs. Additionally, we examine the scaling properties of training and inference in agentic RAG. The project homepage is available at https://rag-gym.github.io/.

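This work addresses the limitation that traditional RAG architectures rely on static retrieval when handling complex questions. It proposes RAG-Gym, a unified optimization framework that improves information-seeking agents through fine-grained process supervision, and introduces the ReSearch architecture, which couples answer reasoning with search query generation. Experimental results show that RAG-Gym improves performance by up to 25.6% across multiple agent architectures, and demonstrate both the effectiveness of advanced large language models as process reward judges and the transferability of trained reward models across different large language models.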

RAG-Gym: Optimizing Reasoning and Search Agents with Process Supervision