Sep, 2024
Linear Projections of Teacher Embeddings for Few-Class Distillation
Noel Loo, Fotis Iliopoulos, Wei Hu, Erik Vee
TL;DR
This work addresses the limitation that knowledge distillation performs poorly on binary and few-class problems. It proposes a new method, Learning Embedding Linear Projections (LELP), which distills knowledge effectively by identifying informative linear subspaces of the teacher model's embedding space and splitting them into pseudo-subclasses. Experiments show that LELP outperforms existing distillation algorithms on large-scale NLP benchmarks, demonstrating its broad applicability.
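The pseudo-subclass construction described above can be sketched as follows. This is a hypothetical illustration of the idea, not the authors' implementation: it assumes the informative linear subspace is found via PCA on the teacher embeddings, and that each original class is split by the sign pattern of the projection onto the top principal directions.

```python
# Hypothetical sketch of the LELP idea: (1) find informative linear directions
# in the teacher's embedding space via PCA, (2) split each original class into
# pseudo-subclasses along those directions, (3) yield expanded pseudo-labels
# that a student could then be distilled against. The PCA/sign-split choices
# here are illustrative assumptions, not the paper's exact procedure.
import numpy as np

rng = np.random.default_rng(0)

# Toy teacher embeddings for a binary task: 200 samples, 16-dim embeddings.
emb = rng.normal(size=(200, 16))
labels = rng.integers(0, 2, size=200)

def pseudo_subclass_labels(embeddings, labels, n_directions=1):
    """Split each class into 2**n_directions pseudo-subclasses using the sign
    of the projection onto the top principal directions of the embeddings."""
    centered = embeddings - embeddings.mean(axis=0)
    # Top principal directions via SVD (rows of vt are right singular vectors).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[:n_directions].T           # (n_samples, n_directions)
    bits = (proj > 0).astype(int)                   # sign pattern per sample
    sub_id = bits @ (2 ** np.arange(n_directions))  # encode pattern as integer
    # Each original class c maps to pseudo-classes c * 2**n_directions + sub_id.
    return labels * (2 ** n_directions) + sub_id

pseudo = pseudo_subclass_labels(emb, labels, n_directions=1)
# The binary problem becomes a 4-way pseudo-problem, giving the student a
# richer target distribution than the original two soft probabilities.
```

Collapsing the pseudo-labels (integer-dividing by the number of subclasses per class) recovers the original labels, so the student's predictions remain usable for the original few-class task.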
Abstract
Knowledge Distillation (KD) has emerged as a promising approach for transferring knowledge from a larger, more complex teacher model to a smaller student model. Traditionally, KD involves training the student to …
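The abstract is cut off here, but the "traditional" KD setup it refers to is the standard formulation of Hinton et al. (2015): the student is trained to match the teacher's temperature-softened output distribution via a KL-divergence loss. A minimal NumPy sketch of that objective, for the two-class case the paper targets:

```python
# Standard knowledge-distillation loss (Hinton et al., 2015), sketched in
# NumPy: KL(teacher || student) on temperature-softened class distributions,
# scaled by T**2 as in the original formulation. Illustrative only; the
# paper's contribution (LELP) modifies what the student is trained against.
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Mean KL divergence between softened teacher and student distributions."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return float((p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1).mean() * T**2)

teacher_logits = np.array([[4.0, 1.0], [0.5, 3.5]])
student_logits = np.array([[2.0, 2.0], [1.0, 1.0]])
loss = kd_loss(student_logits, teacher_logits)
# The loss is zero only when the softened distributions match exactly.
```

With only two classes, each soft label carries a single degree of freedom, which is why this objective transfers so little "dark knowledge" in the few-class regime the paper addresses.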