Nov, 2022
Curriculum Temperature for Knowledge Distillation
Zheng Li, Xiang Li, Lingfeng Yang, Borui Zhao, Renjie Song...
TL;DR
This paper proposes a simple curriculum-temperature knowledge distillation technique named CTKD, which controls the difficulty level of the distillation task via a dynamic, learnable temperature, gradually increasing the difficulty according to the student's learning stage. Extensive experiments on CIFAR-100, ImageNet-2012, and MS-COCO demonstrate the effectiveness of this method.
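To make the role of the temperature concrete, here is a minimal illustrative sketch of a standard Hinton-style distillation loss with an explicit temperature parameter. This is an assumption-laden toy example, not the paper's CTKD implementation (which learns the temperature adversarially during training); it only shows how the temperature softens the two distributions being matched.

```python
import numpy as np

def softmax(logits, T):
    """Temperature-scaled softmax; larger T yields a softer distribution."""
    z = logits / T
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T):
    """KL(teacher || student) on temperature-softened outputs,
    scaled by T^2 as in standard knowledge distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * np.sum(p * (np.log(p) - np.log(q)))

# Toy logits; in CTKD the temperature T would be a learnable scalar
# scheduled over training rather than a fixed hyper-parameter.
teacher = np.array([3.0, 1.0, 0.2])
student = np.array([2.0, 1.5, 0.5])
for T in (1.0, 2.0, 4.0):
    print(f"T={T}: loss={kd_loss(student, teacher, T):.4f}")
```

The curriculum idea in the TL;DR amounts to varying this `T` over training so that the distillation task starts easy and becomes progressively harder as the student improves.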
Abstract
Most existing distillation methods ignore the flexible role of the temperature in the loss function and fix it as a hyper-parameter that can be decided by an inefficient grid search. In general, the temperature