Supervised fine-tuning (SFT) on an instruction-following corpus is a crucial
approach to aligning large language models (LLMs). However, the
performance of LLMs on standard knowledge and reasoning benchmarks tends to
deteriorate in the later stages of the SFT process, echoing the
phenomenon of alignment tax. Based on a pilot study, we hypothesize that
data biases are probably one cause of this phenomenon. To address the
issue, we introduce a simple disperse-then-merge framework. Concretely, we
disperse the instruction-following data into portions and train multiple
sub-models, each on a different data portion. We then merge these sub-models into a
single model via model merging techniques. Despite its simplicity, our framework
outperforms various sophisticated methods such as data curation and training
regularization on a series of standard knowledge and reasoning benchmarks.
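
The disperse-then-merge procedure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `disperse`, `train_sub_model`, and `merge` are hypothetical, the "training" step is a toy stand-in for an actual SFT run, and uniform parameter averaging is only one possible merging technique, since the text does not say which one is used.

```python
import random


def disperse(examples, num_portions, seed=0):
    """Shuffle the examples and deal them into num_portions roughly equal parts."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::num_portions] for i in range(num_portions)]


def train_sub_model(init_params, portion):
    """Hypothetical stand-in for one SFT run on a single data portion.

    A real run would update the parameters by gradient descent. Here each
    parameter is moved halfway toward the portion mean so the sketch runs.
    """
    target = sum(portion) / len(portion)
    return {name: 0.5 * value + 0.5 * target for name, value in init_params.items()}


def merge(sub_models):
    """Merge sub-models by uniform parameter averaging (an assumed technique)."""
    names = sub_models[0].keys()
    return {n: sum(m[n] for m in sub_models) / len(sub_models) for n in names}


# Toy "base model" and toy "instruction data" (plain floats).
base = {"w": 0.0, "b": 1.0}
data = [float(x) for x in range(12)]

portions = disperse(data, num_portions=3)                   # disperse
sub_models = [train_sub_model(base, p) for p in portions]   # train one sub-model per portion
merged = merge(sub_models)                                  # merge into a single model
```

In this sketch every sub-model starts from the same `base` parameters, which is what makes averaging them parameter by parameter meaningful. The number of portions and the way the data are split are free choices here, not values taken from the text.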

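Through our study, we hypothesize that data biases may be one cause of the performance degradation that large language models exhibit in the later stages of fine-tuning. To address this issue, we introduce a simple disperse-then-merge framework. Despite its simplicity, our framework outperforms various sophisticated methods on a series of standard knowledge and reasoning benchmarks.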