Dec, 2021
Confidence-Aware Multi-Teacher Knowledge Distillation
Hailin Zhang, Defang Chen, Can Wang
TL;DR
This work proposes a method that adaptively assigns sample-wise reliability to each teacher based on the confidence of its predictions, stabilizing the knowledge transfer process. It further incorporates intermediate layers to improve student performance, and outperforms all other existing methods across different teacher-student architectures.
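The weighting idea can be illustrated with a short sketch. Below is a minimal, hypothetical PyTorch example of confidence-aware aggregation of the output-layer distillation loss, assuming each teacher's per-sample confidence is measured by its cross-entropy against the ground-truth labels; the function name `confidence_weighted_kd_loss` and all arguments are illustrative and not taken from the paper's code.

```python
import torch
import torch.nn.functional as F

def confidence_weighted_kd_loss(teacher_logits_list, student_logits, labels, temperature=4.0):
    """Sketch: weight each teacher's KD loss by a per-sample confidence score,
    where a teacher that predicts the true label with lower cross-entropy
    receives a larger weight for that sample."""
    # Per-sample cross-entropy of each teacher against the true labels,
    # shape: (num_teachers, batch_size)
    teacher_ce = torch.stack([
        F.cross_entropy(t_logits, labels, reduction="none")
        for t_logits in teacher_logits_list
    ])
    # Convert errors to confidence weights via a softmax over the teacher
    # dimension: lower error on a sample -> higher weight.
    weights = F.softmax(-teacher_ce, dim=0)  # (num_teachers, batch_size)

    # Standard temperature-scaled KL divergence between the student and each teacher.
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    kd_losses = torch.stack([
        F.kl_div(log_p_student, F.softmax(t_logits / temperature, dim=1),
                 reduction="none").sum(dim=1)
        for t_logits in teacher_logits_list
    ])  # (num_teachers, batch_size)

    # Aggregate: per-sample weighted sum over teachers, then mean over the batch.
    return (weights * kd_losses).sum(dim=0).mean() * (temperature ** 2)
```

The same confidence-based weighting can in principle be applied to intermediate-layer feature losses, which is how the summary describes the student performance being further improved.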
Abstract
Knowledge distillation is initially introduced to utilize additional supervision from a single teacher model for the student model training. To boost the student performance, some recent variants attempt to explo…