Deep learning architectures are computationally heavy, and the bulk of the computational energy is consumed by the convolution operations in convolutional neural networks (CNNs). The objective of this work is to reduce the energy consumption and size of CNNs so that machine learning techniques can be used for edge computing on ubiquitous computing devices. We propose a Systematic Quality Scalable Design Methodology consisting of Quality Scalable Quantization at a higher abstraction level and Quality Scalable Multipliers at a lower abstraction level. The first component is parameter compression, in which we approximate the values in the filters of deep learning models by encoding them in 3 bits. We propose shift-and-scale based on-chip decoding hardware that decodes these 3-bit representations to recover approximate filter values. This reduces the size of the DNN model, which can then be sent over a communication channel and decoded on the edge computing device. Power is reduced as well, because the approximation limits the number of data bits. In the second component, we propose a quality scalable multiplier that reduces the number of partial products by converting numbers to canonical signed digit (CSD) representation and further approximating them by truncating least significant bits. The quantized CNNs provide almost the same accuracy as the network with the original weights, with little or no fine-tuning. The hardware for the adaptive multipliers uses clock gating to reduce energy consumption during multiplications. The proposed methodology greatly reduces the memory and power requirements of DNN models, making it a feasible approach for deploying deep learning on edge computing devices. Experiments on LeNet and ConvNets show an increase of up to 6% in zeros and memory savings of up to 82.4919%, while keeping the accuracy near the state of the art.
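To illustrate why a canonical signed digit recoding reduces the number of partial products, the following is a minimal software sketch, not the hardware design described here. The function names (`to_csd`, `truncate_lsbs`, `approx_multiply`) and the choice of zeroing the `k` lowest digit positions are assumptions made for illustration only.

```python
def to_csd(n: int) -> list[int]:
    """Canonical signed digit form of n, least significant digit first.

    Each digit is -1, 0, or +1, and no two adjacent digits are nonzero.
    """
    digits = []
    while n != 0:
        if n % 2 == 0:
            digits.append(0)
        else:
            d = 2 - (n % 4)  # +1 when n = 1 (mod 4), -1 when n = 3 (mod 4)
            digits.append(d)
            n -= d
        n //= 2
    return digits


def from_csd(digits: list[int]) -> int:
    """Rebuild the integer value from CSD digits (least significant first)."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def truncate_lsbs(digits: list[int], k: int) -> list[int]:
    """Illustrative approximation: zero out the k least significant digits."""
    k = min(k, len(digits))
    return [0] * k + digits[k:]


def partial_products(digits: list[int]) -> int:
    """Each nonzero digit costs one shifted add or subtract."""
    return sum(1 for d in digits if d != 0)


def approx_multiply(x: int, w: int, k: int = 0) -> int:
    """Multiply x by w using only the nonzero CSD digits of w above position k."""
    digits = truncate_lsbs(to_csd(w), k)
    return sum(d * x * (1 << i) for i, d in enumerate(digits) if d != 0)


w = 119                         # binary 1110111 has six nonzero bits
csd = to_csd(w)                 # 128 - 8 - 1 needs only three nonzero digits
print(partial_products(csd), approx_multiply(5, w), approx_multiply(5, w, k=1))
```

In this sketch, raising `k` trades accuracy for fewer shifted additions, which is the software analogue of the quality knob described for the multiplier.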

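Our work aims to reduce the energy consumption and size of deep learning models on edge computing devices by targeting the convolution operations in deep learning architectures. We propose a Systematic Quality Scalable Design Methodology comprising Quality Scalable Quantization at a higher abstraction level and Quality Scalable Multipliers at a lower abstraction level. Through parameter compression and the quality scalable multiplier design, the method reduces the size and energy consumption of DNN models while maintaining accuracy close to that of the network with the original weights, with little or no fine-tuning. Experiments on LeNet and ConvNets show that the method achieves an increase of up to 6% in zeros and memory savings of up to 82.4919%, while keeping accuracy near the state of the art.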
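The text does not specify the bit layout of the 3-bit codes. As a purely hypothetical sketch of how shift-and-scale decoding can work in principle, the example below assumes one sign bit plus a 2-bit shift amount and a shared per-filter `scale`. This layout, the `encode` and `decode` names, and the nearest-magnitude rule are all assumptions, not the scheme proposed in this work.

```python
def encode(weights: list[float], scale: float) -> list[int]:
    """Map each weight to a 3-bit code: bit 2 = sign, bits 1..0 = shift s.

    Hypothetical layout: the decoded magnitude is scale / 2**s, s in 0..3.
    """
    codes = []
    for w in weights:
        sign = 1 if w < 0 else 0
        # Choose the shift whose magnitude is closest to |w|.
        s = min(range(4), key=lambda k: abs(abs(w) - scale / (1 << k)))
        codes.append((sign << 2) | s)
    return codes


def decode(codes: list[int], scale: float) -> list[float]:
    """Recover approximate weights with one shift and one scale per code."""
    out = []
    for c in codes:
        magnitude = scale / (1 << (c & 0b11))  # right shift by s
        out.append(-magnitude if c & 0b100 else magnitude)
    return out


filt = [0.5, -0.25, 0.125, -0.0625, 0.3]
codes = encode(filt, scale=0.5)
print(codes, decode(codes, scale=0.5))
```

Under these assumptions only the 3-bit codes and one scale per filter need to be stored or transmitted, and decoding needs no multiplier beyond the shared scale.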

Quality Scalable Quantization Method for Deep Learning on the Edge