Current state-of-the-art object detectors incur high computational costs and are hard to deploy on low-end devices. Knowledge distillation, which aims at training a smaller student network by transferring knowledge from a larger teacher model, is one of the promising so