Retrieval-augmented generation (RAG) uses retrieved texts to enhance large language models (LLMs). However, studies show that RAG is not consistently effective and can even mislead LLMs when the retrieved texts are noisy or incorrect. This suggests that RAG has a duality comprising both benefit and detriment. Although many existing methods attempt to address this issue, they lack a theoretical explanation for the duality in RAG. The benefit and detriment within this duality remain a black box that cannot be quantified or compared in an explainable manner. This paper takes a first step toward a theoretical explanation of benefit and detriment in RAG by: (1) decoupling and formalizing them from the RAG prediction, (2) approximating the gap between their values by representation similarity, and (3) establishing the trade-off mechanism between them, so that they become explainable, quantifiable, and comparable. We demonstrate that the distribution difference between retrieved texts and the LLM's knowledge acts as a double-edged sword, bringing both benefit and detriment. We also prove that the actual effect of RAG can be predicted at the token level. Based on our theory, we propose a novel practical method, X-RAG, which achieves collaborative generation between the pure LLM and RAG at the token level to preserve benefit and avoid detriment. Experiments on real-world tasks with LLMs including OPT, LLaMA-2, and Mistral show the effectiveness of our method and support our theoretical results.
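
As a rough illustration of what token-level collaboration between a pure LLM and RAG could look like, the toy sketch below picks one token per decoding step. It is not the X-RAG algorithm: the abstract does not give the formulas, so the function names (`cosine`, `select_token`), the representation vectors, and the rule "prefer the RAG token when its representation is closer to the retrieved text than to the LLM's own knowledge" are all assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def argmax(probs):
    """Index of the largest entry in a list of probabilities."""
    return max(range(len(probs)), key=probs.__getitem__)


def select_token(llm_probs, rag_probs, rag_repr, retrieved_repr, llm_repr):
    """Pick one token id for the current step and report which source was used.

    ASSUMPTION: similarity to the retrieved text stands in for "benefit" and
    similarity to the LLM's own representation stands in for "detriment".
    This proxy is hypothetical, not the paper's definition.
    """
    llm_tok, rag_tok = argmax(llm_probs), argmax(rag_probs)
    if llm_tok == rag_tok:
        # No conflict: both decoders propose the same token.
        return llm_tok, "agree"
    benefit = cosine(rag_repr, retrieved_repr)
    detriment = cosine(rag_repr, llm_repr)
    if benefit >= detriment:
        return rag_tok, "rag"
    return llm_tok, "llm"


# Toy step: the two decoders disagree and the RAG representation
# lies closer to the retrieved text, so the RAG token is kept.
token, source = select_token(
    llm_probs=[0.7, 0.2, 0.1],
    rag_probs=[0.1, 0.8, 0.1],
    rag_repr=[1.0, 0.0],
    retrieved_repr=[1.0, 0.1],
    llm_repr=[0.0, 1.0],
)
print(token, source)
```

The comparison only runs when the two decoders disagree, which mirrors the abstract's claim that the effect of RAG can be assessed per token rather than per answer; how the representations are actually obtained is left open here.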

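Retrieval-augmented generation (RAG) uses retrieved texts to enhance large language models (LLMs). However, studies show that RAG is not consistently effective and can even mislead LLMs when the retrieved texts contain noise or errors, which indicates that RAG has a duality of benefit and detriment. This work decouples and formalizes the benefit and detriment of RAG, approximates the gap between them by representation similarity, and establishes the trade-off mechanism between them, making them explainable, quantifiable, and comparable. Based on this theory, it proposes a novel practical method, X-RAG, which achieves collaborative generation between the pure LLM and RAG at the token level to preserve benefit and avoid detriment. Experiments with LLMs including OPT, LLaMA-2, and Mistral show the effectiveness of the method and support the theoretical results.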

Unveiling the Duality of Retrieval-Augmented Generation: Theoretical Analysis and Practical Solution