Knowledge Distillation (KD) has been one of the most popu-lar methods to learn a compact model. However, it still suffers from highdemand in time and computational resources caused by sequential train-ing pipeline. Furthermore, the soft targets from deeper models do notoften serve as good cues for the shallower models due to the gap of com-patibility. In this work, we consider these two problems at the same time.Specifically, we propose that better soft targets with higher compatibil-ity can be generated by using a label generator to fuse the feature mapsfrom deeper stages in a top-down manner, and we can employ the meta-learning technique to optimize this label generator. Utilizing the softtargets learned from the intermediate feature maps of the model, we canachieve better self-boosting of the network in comparison with the state-of-the-art. The experiments are conducted on two standard classificationbenchmarks, namely CIFAR-100 and ILSVRC2012. We test various net-work architectures to show the generalizability of our MetaDistiller. Theexperiments results on two datasets strongly demonstrate the effective-ness of our method.

本文提出使用标签生成器以元学习技术优化生成更优软目标，然后利用中间特征图学习软目标来实现网络的自我增强，从而在网络模型压缩方面达到更好的效果，并在两个常见的基准数据集上进行了实验，结果表明该方法的有效性。

MetaDistiller: 借助元学习自我提升的自顶向下蒸馏网络