Training deep neural networks (DNNs) requires intensive computation and storage resources. As a result, DNNs cannot be deployed efficiently on mobile phones and embedded devices, which seriously limits their use in industrial applications. To address this issue, we propose a novel encoding scheme that uses {-1,+1} to decompose quantized neural networks (QNNs) into multi-branch binary networks. These can be implemented efficiently with bitwise operations (xnor and bitcount) to achieve model compression, computational acceleration, and resource savings. With our method, users can choose an arbitrary encoding precision to match their requirements and hardware resources. The proposed mechanism is well suited to FPGAs and ASICs in terms of both data storage and computation, which offers a feasible approach for smart chips. We validate the effectiveness of our method on both large-scale image classification tasks (e.g., ImageNet) and object detection tasks. In particular, our method with low-bit encoding still achieves almost the same performance as its full-precision counterpart.
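
The decomposition described above can be illustrated with a minimal sketch. It assumes, for illustration only, that an M-bit quantized value is an odd integer in [-(2^M - 1), 2^M - 1] written as a sum of 2^m * b_m with every b_m in {-1,+1}; the abstract does not spell out this exact value set. Each bit-plane is packed into a Python integer (bit 1 for +1, bit 0 for -1), so the dot product of two planes reduces to an xnor followed by a bitcount. The helper names `encode`, `binary_dot`, and `decomposed_dot` are hypothetical and not taken from the paper.

```python
def encode(values, bits):
    """Split quantized values into `bits` packed {-1,+1} bit-planes.

    Assumption of this sketch: each value is an odd integer in
    [-(2**bits - 1), 2**bits - 1], so that v = sum(2**m * b_m), b_m in {-1,+1}.
    Plane m is a Python int whose i-th bit is 1 when b_m of values[i] is +1.
    """
    offset = (1 << bits) - 1
    planes = [0] * bits
    for i, v in enumerate(values):
        if abs(v) > offset or (v - offset) % 2 != 0:
            raise ValueError(f"{v} is not representable with {bits} bits")
        u = (v + offset) // 2          # map to an unsigned code in [0, 2**bits - 1]
        for m in range(bits):
            if (u >> m) & 1:           # binary digit 1 corresponds to +1
                planes[m] |= 1 << i
    return planes


def binary_dot(a, b, n):
    """Dot product of two length-n {-1,+1} vectors packed as ints."""
    mask = (1 << n) - 1
    agree = bin(~(a ^ b) & mask).count("1")   # xnor, then bitcount
    return 2 * agree - n                      # agreements minus disagreements


def decomposed_dot(x_planes, w_planes, n):
    """Combine all pairs of binary branches, weighted by powers of two."""
    return sum(
        (1 << (m + k)) * binary_dot(xp, wp, n)
        for m, xp in enumerate(x_planes)
        for k, wp in enumerate(w_planes)
    )


# Usage: 2-bit activations and 2-bit weights, checked against the direct result.
x = [3, -1, 1, -3]
w = [1, 1, -3, 3]
result = decomposed_dot(encode(x, 2), encode(w, 2), len(x))
assert result == sum(a * b for a, b in zip(x, w))
```

In this sketch, an M-bit by K-bit product expands into M*K binary branches, which is one way to read the precision-versus-cost trade-off that the abstract leaves to the user's requirements and hardware resources.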

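To apply DNNs on mobile devices, we propose a new encoding scheme for compressing QNNs. It uses {-1, +1} to decompose a QNN into multiple binary networks and relies on bitwise operations (xnor and bitcount) to achieve model compression, computational acceleration, and resource savings. Our method is well suited to FPGAs and ASICs, and it is validated to achieve performance close to full precision on large-scale image classification (e.g., ImageNet) and object detection tasks.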

Multi-Precision Quantized Neural Networks via Encoding Decomposition of -1 and +1