A common way to extend the memory of large language models (LLMs) is retrieval-augmented generation (RAG), which inserts text retrieved from a larger memory into an LLM's context window. However, the context window is typically limited to several thousand tokens, which limits the number of retrieved passages that can inform a model's response. For this reason, it is important to avoid filling the context window with redundant information by ensuring a degree of diversity among retrieved passages. At the same time, the information should be relevant to the current task. Most prior methods that encourage diversity among retrieved results, such as Maximal Marginal Relevance (MMR), do so by incorporating an objective that explicitly trades off diversity and relevance. We propose a novel, simple optimization metric based on relevant information gain, a probabilistic measure of the total information relevant to a query for a set of retrieved results. When this metric is optimized, diversity emerges organically from our system. When used as a drop-in replacement for the retrieval component of a RAG system, this method yields state-of-the-art performance on question answering tasks from the Retrieval Augmented Generation Benchmark (RGB), outperforming existing metrics that directly optimize for relevance and diversity.
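For readers unfamiliar with the MMR baseline mentioned above, the following is a minimal sketch of its standard greedy formulation, not the method proposed here. The function names `cosine` and `mmr_select`, the use of cosine similarity, and the weight `lam` are illustrative assumptions; the relevant information gain metric itself is not reproduced, since the abstract does not give its formula.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, passage_vecs, k, lam=0.5):
    """Greedy Maximal Marginal Relevance (hypothetical helper).

    At each step, pick the passage maximizing
        lam * sim(passage, query) - (1 - lam) * max sim(passage, already selected).
    lam = 1.0 reduces to plain relevance ranking; smaller lam favors diversity.
    Returns the indices of the selected passages in selection order.
    """
    relevance = [cosine(query_vec, p) for p in passage_vecs]
    selected = []
    candidates = list(range(len(passage_vecs)))

    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the similarity to the closest already-selected passage.
            redundancy = max(
                (cosine(passage_vecs[i], passage_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    return selected
```

The explicit weight `lam` is the hand-tuned trade-off between relevance and diversity that the abstract contrasts with its single-metric approach: with a near-duplicate passage among the candidates, a low `lam` skips the duplicate in favor of a less relevant but novel passage, while `lam = 1.0` selects the duplicate.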

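The memory of large language models (LLMs) is often extended through retrieval-augmented generation (RAG), which inserts text from a larger memory into the LLM's context window. We propose a novel, simple optimization metric based on relevant information gain; when this metric is optimized, diversity emerges naturally from our system. When used as a drop-in replacement for the retrieval component of a RAG system, this method achieves state-of-the-art performance on question answering tasks from the Retrieval Augmented Generation Benchmark (RGB), outperforming existing metrics that directly optimize for relevance and diversity.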

Improved RAG Using Relevant Information Gain