May 2021
Exploring Knowledge Distillation
Towards Understanding Knowledge Distillation
Mary Phuong, Christoph H. Lampert
TL;DR
This work establishes the theoretical effectiveness of knowledge distillation by studying the special case of linear and deep linear classifiers, and identifies three key factors that determine its success: data geometry, optimization bias, and strong monotonicity.
Abstract
Knowledge distillation, i.e., one classifier being trained on the outputs of another classifier, is an empirically very successful technique for knowledge transfer between classifiers. It has even been observed that classifiers learn much faster and more reliably if trained with the outputs of another classifier as soft labels, instead of from ground truth data.
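To make the setting concrete, below is a minimal sketch of distillation for a linear (logistic) student trained on a fixed teacher's soft outputs, in the spirit of the paper's linear-classifier setting. The synthetic data, the teacher weights, and all hyperparameters are illustrative assumptions, not the authors' experimental setup.

```python
# Minimal sketch of knowledge distillation with a linear (logistic) student.
# All names and hyperparameters are illustrative assumptions, not the paper's code.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic data and a fixed "teacher" linear classifier.
n, d = 500, 20
X = rng.normal(size=(n, d))
w_teacher = rng.normal(size=d)
soft_labels = sigmoid(X @ w_teacher)        # teacher outputs used as soft targets

# Student: a linear classifier trained by gradient descent on the
# cross-entropy between its outputs and the teacher's soft labels.
w_student = np.zeros(d)
lr = 0.1
for step in range(1000):
    p = sigmoid(X @ w_student)
    grad = X.T @ (p - soft_labels) / n      # gradient of the distillation loss
    w_student -= lr * grad

# How often the student reproduces the teacher's hard decisions.
agreement = np.mean((X @ w_student > 0) == (X @ w_teacher > 0))
print(f"student-teacher agreement: {agreement:.3f}")
```

The key design choice, matching the abstract's description, is that the student never sees ground-truth labels: its only training signal is the teacher's output probabilities.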