Retrieval-Augmented Generation (RAG) is widely adopted for its effectiveness and cost-efficiency in mitigating hallucinations and enhancing the domain-specific generation capabilities of large language models (LLMs). However, are this effectiveness and cost-efficiency truly a free lunch? In this study, we comprehensively investigate the fairness costs associated with RAG by proposing a practical three-level threat model from the perspective of user awareness of fairness. Specifically, varying levels of user fairness awareness result in different degrees of fairness censorship of the external dataset. We examine the fairness implications of RAG using uncensored, partially censored, and fully censored datasets. Our experiments demonstrate that fairness alignment can be easily undermined through RAG without any fine-tuning or retraining. Even with fully censored and supposedly unbiased external datasets, RAG can lead to biased outputs. Our findings underscore the limitations of current alignment methods in the context of RAG-based LLMs and highlight the urgent need for new strategies to ensure fairness. We propose potential mitigations and call for further research to develop robust fairness safeguards in RAG-based LLMs.

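This study examines the impact of retrieval-augmented generation (RAG) on the fairness of large language models (LLMs) and shows how RAG can lead to unfair outcomes at different levels of user fairness awareness. Our experiments show that RAG can produce biased outputs even with fully censored, supposedly unbiased datasets, so new strategies are needed to ensure fairness.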

No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in Large Language Models, Even for Vigilant Users