Jul, 2019
Similarity-Preserving Knowledge Distillation
Frederick Tung, Greg Mori
TL;DR
This paper proposes a new knowledge distillation loss that guides the training of a student neural network by preserving the activation patterns that semantically similar inputs elicit in a trained teacher network: pairs of inputs that produce similar (or dissimilar) activations in the teacher are encouraged to produce correspondingly similar (or dissimilar) activations in the student, so that pairwise input similarities are preserved in each network's own representation space. Experimental results demonstrate the potential of this approach.
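The core of the method is a loss on batch-wise activation-similarity matrices. Below is a minimal PyTorch-style sketch of such a similarity-preserving distillation loss, assuming student and teacher feature maps for the same mini-batch; the function name `sp_loss` and the single-layer, batch-flattened formulation are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def sp_loss(f_student: torch.Tensor, f_teacher: torch.Tensor) -> torch.Tensor:
    """Similarity-preserving KD loss (sketch): match the row-normalized
    B x B activation-similarity matrices of student and teacher.

    f_student, f_teacher: feature maps of shape (B, C, H, W); channel and
    spatial sizes may differ between the two networks.
    """
    b = f_student.size(0)
    # Flatten each sample's activations into a single vector.
    q_s = f_student.reshape(b, -1)
    q_t = f_teacher.reshape(b, -1)
    # B x B pairwise similarity matrices, L2-normalized row-wise.
    g_s = F.normalize(q_s @ q_s.t(), p=2, dim=1)
    g_t = F.normalize(q_t @ q_t.t(), p=2, dim=1)
    # Mean squared Frobenius difference, scaled by 1/B^2.
    return (g_s - g_t).pow(2).sum() / (b * b)
```

In training, this term would typically be added, with a weighting factor, to the usual cross-entropy loss on the student's predictions.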
Abstract
Knowledge distillation is a widely applicable technique for training a student neural network under the guidance of a trained teacher network. For example, in neural network compression, a high-capacity teacher is distilled to train a compact student network.