BriefGPT.xyz
May, 2024
Beyond Imitation: Learning Key Reasoning Steps from Dual Chain-of-Thoughts in Reasoning Distillation
Chengwei Dai, Kun Li, Wei Zhou, Songlin Hu
TL;DR
The mistake-driven key reasoning step distillation method (EDIT) helps small language models learn important reasoning steps more effectively than plain fine-tuning alone; its effectiveness is validated on benchmark reasoning datasets.
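The core intuition of learning from dual chains of thought can be sketched as follows: given a correct CoT and a wrong CoT for the same problem, the steps where the two chains diverge are candidates for "key" reasoning steps. This is only a minimal illustration under assumed inputs (steps as lists of strings, alignment via `difflib`); the paper's actual EDIT procedure is more involved.

```python
import difflib

def key_step_mask(correct_cot, wrong_cot):
    """Mark steps of the correct CoT that diverge from a wrong CoT.

    Illustrative sketch only: steps shared by both chains are treated
    as non-key; steps unique to the correct chain are flagged as key.
    """
    matcher = difflib.SequenceMatcher(a=wrong_cot, b=correct_cot)
    mask = [True] * len(correct_cot)  # assume every step is key by default
    for block in matcher.get_matching_blocks():
        # Steps appearing in both chains are not distinguishing, so unmark them.
        for j in range(block.b, block.b + block.size):
            mask[j] = False
    return mask

correct = ["parse the question", "set up x + y = 10", "solve x = 6", "answer 6"]
wrong   = ["parse the question", "set up x - y = 10", "solve x = 12", "answer 12"]
print(key_step_mask(correct, wrong))  # → [False, True, True, True]
```

A mask like this could then be used to put extra training weight on the flagged steps when fine-tuning the small model, rather than imitating every token of the teacher's CoT uniformly.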
Abstract
As large language models (LLMs) scale up and gain powerful Chain-of-Thoughts (CoTs) reasoning abilities, practical resource constraints drive efforts to distill these capabilities into more compact smaller language models (SLMs). …