February 2020
Self-Distillation Amplifies Regularization in Hilbert Space
Hossein Mobahi, Mehrdad Farajtabar, Peter L. Bartlett
TL;DR
This paper presents the first theoretical analysis of the self-distillation phenomenon. It shows that successive rounds of self-distillation modify the solution by progressively limiting the number of basis functions, effectively amplifying the underlying regularization: a few rounds of self-distillation reduce overfitting, but further iterations can lead to underfitting and degraded performance.
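To make the described mechanism concrete, here is a minimal sketch of iterative self-distillation in the kernel regression setting the paper analyzes. It is an illustrative assumption, not the authors' code: the use of scikit-learn's KernelRidge, the RBF kernel, the synthetic sine data, and the regularization strength alpha are all hypothetical choices. Each round refits the same model to the previous round's predictions; the printed distance from the original noisy labels grows with the number of rounds, reflecting the drift from fitting toward underfitting.

```python
# Sketch: iterative self-distillation with kernel ridge regression.
# Hypothetical illustration of the paper's setting, not its implementation.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))
y = np.sin(3 * X[:, 0]) + 0.3 * rng.standard_normal(50)  # noisy targets

targets = y
for step in range(5):
    # Same architecture and fixed L2 regularization at every round.
    model = KernelRidge(kernel="rbf", alpha=0.1)
    model.fit(X, targets)
    # The next round trains on the current model's own predictions.
    targets = model.predict(X)
    # Each round further shrinks the solution in the kernel's eigenbasis,
    # acting like progressively stronger regularization: early rounds
    # curb overfitting, later rounds collapse toward underfitting.
    print(step, np.linalg.norm(targets - y))
```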
Abstract
Knowledge distillation, introduced in the deep learning context, is a method to transfer knowledge from one architecture to another. In particular, when the architectures are identical, this is called self-distillation.