Large language models (LLMs) demonstrate powerful capabilities, but they still face challenges in practical applications, such as hallucinations, slow knowledge updates, and lack of transparency in answers. Retrieval-Augmented Generation (RAG) refers to the retrieval of relevant information from external knowledge bases before answering questions with LLMs. RAG has been demonstrated to significantly enhance answer accuracy, reduce model hallucination, particularly for knowledge-intensive tasks. By citing sources, users can verify the accuracy of answers and increase trust in model outputs. It also facilitates knowledge updates and the introduction of domain-specific knowledge. RAG effectively combines the parameterized knowledge of LLMs with non-parameterized external knowledge bases, making it one of the most important methods for implementing large language models. This paper outlines the development paradigms of RAG in the era of LLMs, summarizing three paradigms: Naive RAG, Advanced RAG, and Modular RAG. It then provides a summary and organization of the three main components of RAG: retriever, generator, and augmentation methods, along with key technologies in each component. Furthermore, it discusses how to evaluate the effectiveness of RAG models, introducing two evaluation methods for RAG, emphasizing key metrics and abilities for evaluation, and presenting the latest automatic evaluation framework. Finally, potential future research directions are introduced from three aspects: vertical optimization, horizontal scalability, and the technical stack and ecosystem of RAG.

大型语言模型（LLMs）在实际应用中仍面临幻觉、知识更新缓慢和答案透明度不足等挑战。检索增强生成（RAG）是指在LLMs回答问题之前从外部知识库中检索相关信息。该论文概述了LLMs时代RAG的发展范式，总结了三种范式：Naive RAG，Advanced RAG和Modular RAG。同时，它提供了RAG的三个主要组成部分：检索器、生成器和增强方法的摘要和组织，以及每个组件的关键技术。此外，论文讨论了如何评估RAG模型的有效性，并介绍了两种RAG的评估方法、重点指标和能力，以及最新的自动评估框架。最后，从垂直优化、水平可扩展性和RAG的技术堆栈和生态系统三个方面引入了潜在的未来研究方向。

大语言模型的检索增强生成：综述