We proposed an end-to-end system design towards utilizing Retrieval Augmented Generation (RAG) to improve the factual accuracy of Large Language Models (LLMs) for domain-specific and time-sensitive queries related to private knowledge-bases. Our system integrates RAG pipeline with upstream datasets processing and downstream performance evaluation. Addressing the challenge of LLM hallucinations, we finetune models with a curated dataset which originates from CMU's extensive resources and annotated with the teacher model. Our experiments demonstrate the system's effectiveness in generating more accurate answers to domain-specific and time-sensitive inquiries. The results also revealed the limitations of fine-tuning LLMs with small-scale and skewed datasets. This research highlights the potential of RAG systems in augmenting LLMs with external datasets for improved performance in knowledge-intensive tasks. Our code and models are available on Github.

我们提出了一种朝着利用检索增强生成（RAG）改进大规模语言模型（LLMs）对私人知识库相关的领域特定和时间敏感查询的事实准确性的端到端系统设计。我们的系统将RAG流水线与上游数据集处理和下游性能评估集成在一起。通过使用源自CMU广泛资源并以教师模型进行注释的策划数据集对模型进行微调，解决了LLM产生的幻觉挑战。我们的实验表明该系统在生成更准确的领域特定和时间敏感查询答案方面的有效性。结果还揭示了使用规模较小和偏斜的数据集进行微调LLM的限制。这项研究突出了RAG系统在增强LLMs表现方面的潜力在知识密集型任务中。我们的代码和模型可在Github上找到。

利用RAG提高LLM事实准确性以应对幻觉：私有知识库中领域特定查询的案例研究