Mixed-precision networks allow a different quantization bit-width for every layer in the network. A major limitation of existing work is that the bit-width of each layer must be fixed at training time. This leaves little flexibility if the characteristics of the device on which the network is deployed change at runtime. In this work, we propose Bit-Mixer, the first method to train a meta-quantized network in which, at test time, any layer can change its bit-width without degrading the network's ability to perform highly accurate inference. To this end, we make two key contributions: (a) Transitional Batch-Norms, and (b) a 3-stage optimization process that is shown to be capable of training such a network. We show that our method yields mixed-precision networks that exhibit the flexibility properties desirable for on-device deployment without compromising accuracy. Code will be made available.
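
To make concrete what "changing a layer's bit-width at test time" could look like, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the Bit-Mixer method: it uses plain symmetric uniform quantization of the weights and implements neither Transitional Batch-Norms nor the 3-stage optimization. The names `quantize_uniform`, `SwitchableLinear`, and `set_bits` are hypothetical.

```python
import numpy as np


def quantize_uniform(x, bits):
    """Symmetric uniform fake-quantization of x to `bits` bits.

    Assumption: a generic quantizer chosen for illustration only.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2")
    x = np.asarray(x, dtype=np.float64)
    levels = 2 ** (bits - 1) - 1           # e.g. 7 positive levels for 4 bits
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x.copy()                    # nothing to quantize
    scale = max_abs / levels               # step size between adjacent levels
    return np.round(x / scale) * scale     # snap to the nearest level


class SwitchableLinear:
    """Linear layer whose weight bit-width can be changed at run time (hypothetical)."""

    def __init__(self, weight, bits=8):
        self.weight = np.asarray(weight, dtype=np.float64)  # full-precision copy
        self.set_bits(bits)

    def set_bits(self, bits):
        if bits < 2:
            raise ValueError("bits must be at least 2")
        self.bits = bits

    def __call__(self, x):
        # Quantize on the fly with whichever bit-width is currently selected.
        return x @ quantize_uniform(self.weight, self.bits).T


rng = np.random.default_rng(0)
layer = SwitchableLinear(rng.standard_normal((4, 8)), bits=8)
x = rng.standard_normal((3, 8))
y_high = layer(x)      # inference at 8 bits
layer.set_bits(2)      # the device budget changes, so drop this layer to 2 bits
y_low = layer(x)       # same layer, same weights, lower precision
```

In a naive setup like this one, a network trained at a single precision would typically lose accuracy when a layer is switched to a bit-width it never saw during training; that gap is the problem the abstract says Bit-Mixer addresses.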

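This paper proposes Bit-Mixer, a method for training a mixed-precision network with multiple quantized layers for highly accurate prediction, in which any layer can change its bit-width at test time. Using Transitional Batch-Norms and a 3-stage optimization process, it shows how such a network can be trained and that the resulting mixed-precision networks have the flexibility properties desirable for on-device deployment without compromising inference accuracy.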

Bit-Mixer: Mixed-precision networks with runtime bit-width selection