Upon its release in late 2022, ChatGPT brought a seismic shift to the entire landscape of AI, both in research and in commerce. By instruction-tuning a large language model (LLM) with supervised fine-tuning and reinforcement learning from human feedback, it showed that a model could answer human questions and follow instructions on a broad range of tasks. Following this success, interest in LLMs has intensified, with new LLMs appearing at frequent intervals across academia and industry, including many from start-ups focused on LLMs. While closed-source LLMs (e.g., OpenAI's GPT, Anthropic's Claude) generally outperform their open-source counterparts, progress on the latter has been rapid, with claims of achieving parity or even better performance on certain tasks. This has crucial implications not only for research but also for business. In this work, on the first anniversary of ChatGPT, we provide an exhaustive overview of this success, surveying all tasks where an open-source LLM has claimed to be on par with or better than ChatGPT.

ChatGPT's One-Year Anniversary: Are Open-Source Large Language Models Catching Up?