Running Deep Neural Network (DNN) models on devices with limited computational capability is challenging due to their large compute and memory requirements. Quantized Neural Networks (QNNs) have emerged as a potential solution to this problem, promising to retain most of the DNN's accuracy at a much lower compute and memory cost.
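To make the idea concrete, the following is a minimal sketch (not drawn from this work) of uniform affine quantization, the basic operation underlying most QNNs: a floating-point weight tensor is mapped to 8-bit integers via a scale and zero point, and can be approximately recovered by dequantization. The function names and the choice of signed 8-bit integers are illustrative assumptions.

```python
import numpy as np

def quantize_uniform(w, num_bits=8):
    # Illustrative uniform affine quantization to signed num_bits integers.
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = (w.max() - w.min()) / (qmax - qmin)   # step size of the grid
    zero_point = int(round(qmin - w.min() / scale))
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Map integers back to approximate floating-point values.
    return scale * (q.astype(np.float32) - zero_point)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale, zero_point = quantize_uniform(w)
w_hat = dequantize(q, scale, zero_point)
max_err = float(np.abs(w - w_hat).max())  # bounded by roughly scale / 2
```

Storing `q` instead of `w` cuts memory 4x (int8 vs. float32), and integer arithmetic on `q` is what enables cheaper inference on constrained hardware; the price is the quantization error `max_err`, which QNN training and calibration methods aim to keep small.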