Considerable effort has been invested in improving the role-play proficiency of open-source large language models (LLMs) by emulating proprietary counterparts. Nevertheless, we posit that LLMs inherently possess role-play capabilities, owing to the extensive knowledge of characters and potential dialogues ingrained in their vast training corpora. In this study, we therefore introduce Ditto, a self-alignment method for role-play. Ditto capitalizes on character knowledge, encouraging an instruction-following LLM to simulate role-play dialogues as a variant of reading comprehension. This method creates a role-play training set comprising 4,000 characters, a tenfold increase over currently available datasets in the number of roles. We then fine-tune the LLM on this self-generated dataset to strengthen its role-play capabilities. On our carefully constructed, reproducible role-play benchmark and the role-play subset of MT-Bench, Ditto, at various parameter scales, consistently maintains role identity and provides accurate role-specific knowledge in multi-turn role-play conversations. Notably, it outperforms all open-source role-play baselines and reaches performance comparable to advanced proprietary chatbots. Furthermore, we present the first comprehensive cross-supervision alignment experiment in the role-play domain, revealing that the knowledge exhibited in role-play is bounded by the intrinsic capabilities of the LLM, whereas role-play style can be easily acquired even under the guidance of smaller models. We open-source related resources at https://github.com/OFA-Sys/Ditto.

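This study proposes Ditto, a self-alignment method for role-play that exploits the character knowledge contained in large-scale training corpora, prompting an instruction-following LLM to simulate role-play dialogues as a reading-comprehension task. Fine-tuned on the self-generated role-play training set, Ditto maintains a consistent role identity and accurate role-specific knowledge in multi-turn conversations, outperforming other open-source role-play baselines and rivaling advanced proprietary chatbots. The study also shows that an LLM's intrinsic capabilities limit the role-specific knowledge it can acquire, whereas role-play style can be easily acquired with guidance from smaller models.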

Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment