BriefGPT.xyz
Jan, 2023
Supervision Complexity and its Role in Knowledge Distillation
Hrayr Harutyunyan, Ankit Singh Rawat, Aditya Krishna Menon, Seungyeon Kim, Sanjiv Kumar
TL;DR
This paper proposes a new theoretical framework for studying the generalization behavior of a distilled student and for evaluating the efficacy of online distillation. The framework highlights a delicate interplay between the complexity of the teacher-provided supervision and its alignment with the student's neural tangent kernel, providing a rigorous theoretical basis for the utility of various popular distillation techniques.
Abstract
Despite the popularity and efficacy of knowledge distillation, there is limited understanding of why it helps. In order to study the generalization behavior of a distilled student, we propose a new theoretical framework that leverages
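For readers unfamiliar with the setup the abstract refers to, the classical knowledge-distillation objective (temperature-softened teacher targets blended with the hard-label loss, as in Hinton et al., 2015) can be sketched as follows. This is a minimal NumPy illustration of that generic loss, not the supervision-complexity framework of this paper; the function names and default hyperparameters are illustrative.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T yields softer probabilities.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, hard_label, T=4.0, alpha=0.5):
    """Blend of a soft-target KL term and the usual hard-label cross-entropy.

    The T**2 factor keeps the soft-target gradient magnitude comparable
    across temperatures, following the standard KD formulation.
    """
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)))
    ce = -np.log(softmax(student_logits)[hard_label])
    return alpha * (T ** 2) * kl + (1 - alpha) * ce

# Example: a student whose logits roughly track the teacher's.
loss = distillation_loss(
    student_logits=[1.0, 0.5, -0.2],
    teacher_logits=[2.0, 0.1, -1.0],
    hard_label=0,
)
```

When the student's logits match the teacher's exactly, the KL term vanishes and only the hard-label cross-entropy remains.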