Instruction tuning is an effective technique for aligning the outputs of large language models (LLMs) with human preferences. However, how to generate seasonal multi-turn dialogues from raw documents for instruction tuning still requires further exploration. In this paper, we present R2S, a novel framework that leverages CoD (Chain of Dialogue) logic to guide LLMs in generating knowledge-intensive multi-turn dialogues for instruction tuning. By integrating raw documents from both open-source datasets and domain-specific web crawls into a benchmark, K-BENCH, we cover diverse areas such as Wikipedia (English), Science (Chinese), and Artifacts (Chinese). Our approach first decides the logic flow of the current dialogue and then prompts LLMs to produce key phrases for sourcing relevant response content. This methodology enables the creation of the GINSTRUCT instruction dataset, which retains raw-document knowledge within dialogue-style interactions. Using this dataset, we fine-tune GLLM, a model designed to transform raw documents into structured multi-turn dialogues, thereby injecting comprehensive domain knowledge into the SFT model for enhanced instruction tuning. This work is a step toward improving the adaptability and effectiveness of LLMs in generating more accurate, contextually nuanced responses across various fields.
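
The two-step procedure described above (decide the logic flow, then produce key phrases used to source response content) can be sketched roughly as follows. This is an illustrative sketch only, not the R2S implementation: the `LOGIC_FLOWS` labels, the prompt strings, the `llm` callable, and the helper `retrieve_sentences` are all hypothetical names introduced here, and the simple substring matching stands in for whatever sourcing mechanism the framework actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical label set; the actual CoD logic types are not given in this section.
LOGIC_FLOWS = ["follow-up", "clarification", "topic-shift"]


@dataclass
class Turn:
    logic: str              # logic flow chosen for this turn
    question: str           # user-side utterance
    key_phrases: List[str]  # phrases used to locate response content
    answer: str             # response assembled from the raw document


def retrieve_sentences(document: str, key_phrases: List[str]) -> List[str]:
    """Return document sentences mentioning any key phrase (case-insensitive)."""
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    lowered = [p.lower() for p in key_phrases]
    return [s for s in sentences if any(p in s.lower() for p in lowered)]


def generate_dialogue(document: str, llm: Callable[[str], str],
                      num_turns: int = 3) -> List[Turn]:
    """Build a multi-turn dialogue from one raw document (assumed workflow)."""
    turns: List[Turn] = []
    for _ in range(num_turns):
        history = " | ".join(t.question for t in turns)
        # Step 1: decide the logic flow of the current turn.
        logic = llm(f"LOGIC\nOptions: {LOGIC_FLOWS}\nHistory: {history}").strip()
        if logic not in LOGIC_FLOWS:
            logic = LOGIC_FLOWS[0]  # fall back if the model answers off-list
        # Step 2: ask for a question, then for key phrases that locate its answer.
        question = llm(
            f"QUESTION\nLogic: {logic}\nDocument: {document}\nHistory: {history}"
        ).strip()
        raw = llm(f"KEYPHRASES\nQuestion: {question}\nDocument: {document}")
        key_phrases = [p.strip() for p in raw.split(",") if p.strip()]
        # Step 3: source the response content from the raw document itself.
        answer = ". ".join(retrieve_sentences(document, key_phrases))
        turns.append(Turn(logic, question, key_phrases, answer))
    return turns
```

Passing any function that maps a prompt string to a completion string as `llm` is enough to run the sketch. Because the answer is assembled from sentences of the source document rather than written freely by the model, the raw-document knowledge is carried into the dialogue, which is the property the abstract attributes to the dataset.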

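Using dialogue logic to generate seasonal multi-turn dialogues from raw documents for the instruction tuning of large language models, this paper introduces a novel framework named R2S. The framework integrates raw documents from open-source datasets and domain-specific web-crawled documents to create the K-BENCH benchmark, which covers diverse areas such as Wikipedia (English), Science (Chinese), and Artifacts (Chinese). This injects broad domain knowledge into instruction tuning and improves the adaptability and effectiveness of the SFT model.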

Raw Text is All You Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Models