Several recent studies have elucidated why knowledge distillation (KD) improves model performance. However, few have investigated advantages of KD beyond improved performance. In this study, we show that KD enhances the interpretability of models as well as their accuracy. To compare model interpretability quantitatively, we measured the number of concept detectors identified by network dissection. We attribute the improvement in interpretability to the class-similarity information transferred from the teacher model to the student model. First, we confirmed that class-similarity information is transferred from the teacher to the student via logit distillation. Then, we analyzed how class-similarity information affects model interpretability, in terms of both its presence or absence and the degree of similarity information. We conducted various quantitative and qualitative experiments and examined the results across different datasets, different KD methods, and different measures of interpretability. Our research shows that models distilled from large models can be used more reliably in various fields.
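
As an illustration of how logit distillation can carry class-similarity information, the following is a minimal sketch of the commonly used temperature-softened KD loss. It is not taken from this study: the function names, the temperature value, and the example logits are all assumptions chosen for illustration, and the study's actual training setup may differ.

```python
import numpy as np


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def soft_targets(logits, temperature=1.0):
    """Softened class probabilities; a higher temperature gives a flatter distribution."""
    return np.exp(log_softmax(logits, temperature))


def logit_distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by temperature**2.

    The temperature of 4.0 is an illustrative default, not a value from the study.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    p_teacher = np.exp(log_p_teacher)
    kl = np.sum(p_teacher * (log_p_teacher - log_p_student), axis=-1)
    return float(np.mean(kl) * temperature ** 2)


# Hypothetical logits for one image over three classes, e.g. (cat, dog, truck).
teacher_logits = np.array([[8.0, 5.0, -2.0]])
student_logits = np.array([[2.0, 1.0, 0.5]])

# The softened teacher output assigns noticeably more mass to the similar
# non-target class than the unsoftened output does.
sharp = soft_targets(teacher_logits, temperature=1.0)
soft = soft_targets(teacher_logits, temperature=4.0)
loss = logit_distillation_loss(student_logits, teacher_logits)
```

In this sketch, the relative probabilities that the softened teacher output gives to non-target classes are what is meant by class-similarity information: minimizing the loss pushes the student to reproduce them rather than only the top-1 label.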

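The study shows that knowledge distillation not only improves model performance but also enhances model interpretability. By quantitatively comparing the change in the number of concept detectors, a measure of model interpretability, the study shows that the class-similarity information transferred from the teacher model to the student model improves interpretability. This conclusion was verified through quantitative and qualitative experiments across different datasets, different KD methods, and different interpretability measures. The results indicate that models trained via KD from large models can be used more reliably in various fields.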

The Impact of Knowledge Distillation on Model Interpretability