Sep, 2020
Densely Guided Knowledge Distillation using Multiple Teacher Assistants
Wonchul Son, Jaemin Na, Wonjun Hwang
TL;DR
This paper proposes a densely guided knowledge distillation method based on multiple teacher assistants. By gradually shrinking the model size across the assistants, it effectively bridges the large capacity gap between the teacher and the student, enables more efficient student learning, and achieves significant performance gains across multiple backbone architectures on CIFAR-10, CIFAR-100, and ImageNet.
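The idea can be illustrated with a minimal PyTorch-style sketch of a distillation loss in which the student is supervised by the softened outputs of the teacher and of every intermediate teacher assistant. The function name `densely_guided_kd_loss`, its arguments, and the equal weighting across guides are assumptions made for illustration, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def densely_guided_kd_loss(student_logits, teacher_logits, assistant_logits_list,
                           labels, temperature=4.0, alpha=0.5):
    """Sketch: combine hard-label cross-entropy with KL terms against the
    teacher and every teacher assistant (TA); names are illustrative."""
    # Standard cross-entropy on ground-truth labels
    ce = F.cross_entropy(student_logits, labels)

    # Softened student distribution shared across all KL terms
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)

    # Dense guidance: the teacher plus each TA contributes a soft target
    guides = [teacher_logits] + list(assistant_logits_list)
    kd = sum(
        F.kl_div(log_p_student,
                 F.softmax(g / temperature, dim=1),
                 reduction="batchmean") * (temperature ** 2)
        for g in guides
    ) / len(guides)

    return alpha * ce + (1.0 - alpha) * kd
```

Averaging over all guides is one simple choice; the point of the dense guidance is that the student receives supervision from every assistant of intermediate size rather than from a single teacher alone.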
Abstract
With the success of deep neural networks, knowledge distillation, which guides the learning of a small student network from a large teacher network, is being actively studied for …