The unparalleled performance of the closed-source ChatGPT has sparked efforts to democratize it, with notable progress made by training on real conversations between users and ChatGPT, as evidenced by Vicuna. However, because human participation is hard to gather, current efforts such as Baize and UltraChat auto-generate conversational data, and they rely primarily on ChatGPT simulating human behavior from instructions rather than on learning from genuine human behavior. This results in limited scope, diminished diversity, and an absence of genuine multi-round conversational dynamics. To address these issues, we take human questions extracted from genuine human-machine conversations as the learning target and train a user simulator, UserGPT, to produce a high-quality, human-centric synthetic conversation dataset, RealChat. This dataset is then used to train our assistant model, ReaLM. Experimentally, ReaLM outperforms baseline models on both Vicuna-Bench and MT-Bench in pairwise comparison at equivalent training set sizes, and manual evaluation also shows that our model is highly competitive. When fine-tuned from the latest LLaMA 2 model, ReaLM achieves a leading score of 6.33 on MT-Bench, surpassing contemporary models of the same scale, including LLaMA-2-7B-chat. Further in-depth analysis demonstrates the scalability and transferability of our approach. We also conduct a preliminary exploration of the interplay between training set data quality and the resulting model performance, laying solid groundwork for future investigations.

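By taking human questions extracted from genuine human-machine conversations as the learning target, we train a user simulator, UserGPT, which produces a high-quality, human-centric synthetic conversation dataset, RealChat. Experimental results show that our model outperforms baseline models on Vicuna-Bench and MT-Bench, and manual evaluation also indicates that it is highly competitive. When fine-tuned from the latest LLaMA 2 model, ReaLM achieves a leading score of 6.33 on MT-Bench, surpassing other models of the same scale, including LLaMA-2-7B-chat. Our approach also demonstrates scalability and transferability, and we conduct a preliminary exploration of the interplay between training set data quality and model performance, laying a solid foundation for future research.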

Large Language Models as User Simulators