June 2021
Knowledge distillation: A good teacher is patient and consistent
Lucas Beyer, Xiaohua Zhai, Amélie Royer, Larisa Markeeva, Rohan Anil...
TL;DR
This paper presents a knowledge distillation method for shrinking large-scale computer vision models without sacrificing performance, and identifies the design choices that make the method effective. Through a comprehensive empirical study, the authors obtain compelling results on a wide range of vision datasets and set a new state of the art for a ResNet-50 model on ImageNet, reaching 82.8% top-1 accuracy.
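To make the "consistent" part of the recipe concrete, below is a minimal sketch of consistent-teaching distillation in PyTorch (my choice of framework, not necessarily the paper's). The names `augment`, `teacher`, `student`, and `optimizer` are hypothetical stand-ins for a real pipeline, and the full recipe from the paper (aggressive mixup augmentation and the very long "patient" training schedules) is omitted; the point shown here is that teacher and student score the exact same augmented view of each image.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL divergence between the teacher's and student's predictive distributions."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

def training_step(images, teacher, student, augment, optimizer):
    # "Consistent" teaching: teacher and student see the *same* augmented view,
    # rather than the teacher scoring a clean image while the student sees crops.
    views = augment(images)
    with torch.no_grad():
        teacher_logits = teacher(views)  # teacher is frozen; no gradients needed
    loss = distillation_loss(student(views), teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Note that the loss itself is the standard distillation objective; what the paper's title stresses is the training discipline around it, i.e. matching inputs exactly (consistency) and training for unusually long schedules (patience).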
Abstract
There is a growing discrepancy in computer vision between large-scale models that achieve state-of-the-art performance and models that are affordable in practical applications.