Learning or Self-aligning? Rethinking Instruction Fine-tuning

Feb, 2024

Learning or Self-aligning? Rethinking Instruction Fine-tuning

Mengjie Ren, Boxi Cao, Hongyu Lin, Liu Cao, Xianpei Han...

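TL;DR: Using a knowledge intervention framework, we reveal the underlying mechanisms of instruction fine-tuning and provide strong support for recent and potential future work.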

Abstract

Instruction fine-tuning (IFT) is a critical phase in building large language models (LLMs). Previous works mainly focus on IFT's role in the transfer of behavioral norms and the learning of additional