BriefGPT.xyz
Jul, 2023
Instruction Mining: High-Quality Instruction Data Selection for Large Language Models
Yihan Cao, Yanbin Kang, Lichao Sun
TL;DR
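This paper proposes InstructMining, a method for evaluating the quality of instruction-following data, and uses it to select high-quality data for fine-tuning. The results show that datasets selected with InstructMining yield better performance.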
Abstract
Large language models typically undergo two training stages, pretraining and finetuning. Despite that large-scale