Enhancing the instruction-following ability of Large Language Models (LLMs) primarily demands substantial instruction-tuning datasets. However, the sheer volume of these imposes a considerable computational burden and annotation cost. To investigate a label-efficient instruction tuning method that allows the model itself to actively sample subsets that are equally or even more effective, we introduce a self-evolving mechanism DiverseEvol. In this process, a model iteratively augments its training subset to refine its own performance, without requiring any intervention from humans or more advanced LLMs. The key to our data sampling technique lies in the enhancement of diversity in the chosen subsets, as the model selects new data points most distinct from any existing ones according to its current embedding space. Extensive experiments across three datasets and benchmarks demonstrate the effectiveness of DiverseEvol. Our models, trained on less than 8% of the original dataset, maintain or improve performance compared with finetuning on full data. We also provide empirical evidence to analyze the importance of diversity in instruction data and the iterative scheme as opposed to one-time sampling. Our code is publicly available at https://github.com/OFA-Sys/DiverseEvol.git.

通过引入自我演变机制DiverseEvol，我们提出了一种标签高效的指令调整方法，该方法允许模型自己主动采样同样或更有效的子集来改善自身性能，而无需人类干预或更先进的LLMs。在选择子集时，我们的数据采样技术的关键在于增强所选子集的多样性，使模型根据当前的嵌入空间选择与任何现有数据点都不同的新数据点。在三个数据集和基准测试中进行的大量实验证明了DiverseEvol的有效性。我们的模型在原始数据集的不到8%的训练基础上，与在完整数据上进行微调相比，性能保持或提高。我们还提供实证证据分析了多样性在指令数据中的重要性以及迭代方案与一次性采样的区别。我们的代码可以在此https URL公开获取。

自主演化多样化数据采样用于高效指导调优