Large language models (LLMs) have demonstrated notable proficiency in code generation, with numerous prior studies showing their promising capabilities in various development scenarios. However, these studies mainly provide evaluations in research settings, which leaves a significant gap in understanding how effectively LLMs can support developers in real-world settings. To address this, we conducted an empirical analysis of conversations in DevGPT, a dataset collected from developers' conversations with ChatGPT (captured with the Share Link feature on platforms such as GitHub). Our findings indicate that the current practice of using LLM-generated code is typically limited to either demonstrating high-level concepts or providing examples in documentation, rather than using it as production-ready code. They also suggest that substantial further work is needed to improve LLMs in code generation before they can become integral parts of modern software development.

Can ChatGPT Support Developers? An Empirical Evaluation of Large Language Models for Code Generation