In this paper, we carry out experimental research on Grammatical Error Correction (GEC). We examine the nuances of single-model systems, compare the efficiency of ensembling and ranking methods, and explore the application of large language models to GEC as single-model systems, as parts of ensembles, and as ranking methods. We set a new state of the art, with F_0.5 scores of 72.8 on CoNLL-2014-test and 81.4 on BEA-test. To support further advances in GEC and ensure the reproducibility of our research, we make our code, trained models, and systems' outputs publicly available.

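Pillars of Grammatical Error Correction: Comprehensive Inspection of Contemporary Approaches in the Era of Large Language Models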