This paper presents the first comprehensive analysis of ChatGPT's Text-to-SQL ability. Given the recent emergence of the large-scale conversational language model ChatGPT and its impressive capabilities in both conversation and code generation, we sought to evaluate its Text-to-SQL performance. We conducted experiments on 12 benchmark datasets covering different languages, settings, and scenarios, and the results demonstrate that ChatGPT has strong Text-to-SQL abilities. Although a gap remains relative to current state-of-the-art (SOTA) models, ChatGPT's performance is still impressive given that the experiments were conducted in a zero-shot setting. Notably, in the ADVETA (RPL) scenario, zero-shot ChatGPT even outperforms, by 4.1%, the SOTA model that requires fine-tuning on the Spider dataset, demonstrating its potential for use in practical applications. To support further research in related fields, we have made the data generated by ChatGPT publicly available at https://github.com/THU-BPM/chatgpt-sql.

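This paper presents a comprehensive analysis of ChatGPT's Text-to-SQL ability. Experiments on 12 benchmark datasets show that ChatGPT has strong Text-to-SQL capabilities and that, in the zero-shot setting, it even outperforms the SOTA model in the ADVETA (RPL) scenario. The results indicate potential value in practical applications, and the data generated by ChatGPT has been publicly released.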

A Comprehensive Evaluation of ChatGPT's Zero-Shot Text-to-SQL Capability