Deep neural networks typically impose heavy computational loads and high memory consumption. Moreover, their large number of parameters constrains deployment on edge devices such as embedded systems. Tensor decomposition offers a clear advantage in compressing large-scale weight tensors. Nevertheless, applying low-rank decomposition directly typically leads to significant accuracy loss. This paper proposes a model compression method that integrates Variational Bayesian Matrix Factorization (VBMF) with orthogonal regularization. First, the model is over-parameterized and trained, with orthogonal regularization applied to make it more likely to reach the accuracy of the original model. Second, VBMF is employed to estimate the rank of the weight tensor at each layer. Our framework is sufficiently general to apply to other convolutional neural networks and is easily adapted to incorporate other tensor decomposition methods. Experimental results show that our compressed model performs well at both high and low compression ratios.
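To make the two ingredients named above concrete, the following is a minimal sketch and not the paper's implementation. It assumes that orthogonal regularization takes the common form of a squared Frobenius norm penalty on W^T W - I, and that a convolutional weight of shape (c_out, c_in, k, k) is unfolded into a c_out by c_in·k·k matrix and factorized into two matrices of rank r. The function names and the rank value 16 are hypothetical. In the proposed method the rank would be estimated by VBMF, which is not reproduced here.

```python
def orthogonality_penalty(w):
    """Squared Frobenius norm of (W^T W - I) for a matrix given as a list of rows.

    Assumed form of the orthogonal regularizer; the paper may use a variant.
    """
    rows, cols = len(w), len(w[0])
    penalty = 0.0
    for i in range(cols):
        for j in range(cols):
            # Entry (i, j) of the Gram matrix W^T W.
            gram_ij = sum(w[k][i] * w[k][j] for k in range(rows))
            target = 1.0 if i == j else 0.0
            penalty += (gram_ij - target) ** 2
    return penalty


def low_rank_param_counts(c_out, c_in, k, rank):
    """Parameter counts before and after a rank-`rank` matrix factorization.

    Assumes the weight is unfolded to shape (c_out, c_in * k * k) and replaced
    by two factors of shapes (c_out, rank) and (rank, c_in * k * k).
    """
    full = c_out * c_in * k * k
    factored = rank * (c_out + c_in * k * k)
    return full, factored


identity = [[1.0, 0.0], [0.0, 1.0]]   # orthonormal columns
skewed = [[1.0, 1.0], [0.0, 1.0]]     # columns are not orthogonal

print(orthogonality_penalty(identity))  # 0.0
print(orthogonality_penalty(skewed))    # 3.0

# Hypothetical 3x3 layer with 64 input and 64 output channels, rank chosen by hand.
full, factored = low_rank_param_counts(c_out=64, c_in=64, k=3, rank=16)
print(full, factored, full / factored)  # 36864 10240 3.6
```

The penalty is zero only when the columns of the weight matrix are orthonormal, which is why it can be added to the training loss as a regularizer. The parameter count shows how the chosen rank sets the compression ratio of a layer.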

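This work addresses the problem that deploying deep neural networks on edge devices is limited by computational load and memory consumption. It proposes a model compression method that combines Variational Bayesian Matrix Factorization with orthogonal regularization, significantly improving the performance of the compressed model at both high and low compression ratios while preserving the accuracy of the original model. The generality of the framework makes it applicable to other convolutional neural networks, and it can easily incorporate other tensor decomposition methods.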

Convolutional Neural Network Compression Based on Low-Rank Decomposition