We establish the fundamental limits in the approximation of Lipschitz functions by deep ReLU neural networks with finite-precision weights. Specifically, we identify three regimes, namely under-, over-, and proper quantization, according to how the minimax approximation error behaves as a function of network weight precision. This is accomplished by deriving tight nonasymptotic lower and upper bounds on the minimax approximation error. Notably, in the proper-quantization regime, neural networks exhibit memory-optimality in the approximation of Lipschitz functions. Deep networks have an inherent advantage over shallow networks in achieving memory-optimality. We also develop the notion of depth-precision tradeoff, showing that networks with high-precision weights can be converted into functionally equivalent deeper networks with low-precision weights, while preserving memory-optimality. This idea is reminiscent of sigma-delta analog-to-digital conversion, where oversampling rate is traded for resolution in the quantization of signal samples. We improve upon the best-known ReLU network approximation results for Lipschitz functions and describe a refinement of the bit extraction technique which could be of independent general interest.

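By deriving lower and upper bounds on the minimax approximation error, we identify three regimes in the approximation of Lipschitz functions by deep ReLU neural networks with finite-precision weights: under-quantization, over-quantization, and proper quantization. Notably, deep networks exhibit memory-optimality in the approximation of Lipschitz functions and have an inherent advantage over shallow networks in achieving it. We further develop the notion of a depth-precision tradeoff, showing that networks with high-precision weights can be converted into functionally equivalent deeper networks with low-precision weights while preserving memory-optimality. This idea is reminiscent of sigma-delta analog-to-digital conversion, where oversampling rate is traded for resolution in the quantization of signal samples. We improve upon existing ReLU network approximation results for Lipschitz functions and describe a refinement of the bit extraction technique that may be of independent general interest.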
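
The depth-precision tradeoff can be illustrated on a toy case. The sketch below is not the construction used in the paper; it is a minimal example under stated assumptions: a single scalar weight w in [0, 1) with k·b fractional bits, a nonnegative input x (so that every ReLU acts as the identity), and an identity path that carries x through the layers. The helper names `to_digits`, `high_precision_layer`, and `deep_low_precision` are hypothetical. The weight is split into k base-2^b digits, and the product w·x is then evaluated by Horner's scheme with k layers whose weights each have only b fractional bits.

```python
from fractions import Fraction


def relu(x):
    """ReLU activation on exact rationals."""
    return max(x, Fraction(0))


def to_digits(w, bits_per_digit, num_digits):
    """Split w in [0, 1) into base-2^b digits, most significant first.

    Assumes w has at most num_digits * bits_per_digit fractional bits.
    """
    base = 2 ** bits_per_digit
    scaled = Fraction(w) * base ** num_digits
    assert scaled.denominator == 1, "w is not representable with this many bits"
    n = int(scaled)
    digits = []
    for _ in range(num_digits):
        n, d = divmod(n, base)
        digits.append(d)
    return digits[::-1]


def high_precision_layer(w, x):
    """One layer with a single high-precision weight w."""
    return relu(Fraction(w) * x)


def deep_low_precision(digits, bits_per_digit, x):
    """len(digits) layers, each using only the weights d/2^b and 1/2^b.

    Horner's scheme: acc <- relu((d / 2^b) * x + (1 / 2^b) * acc),
    processing the least significant digit first. The input x is assumed
    to be forwarded unchanged to every layer (illustrative assumption).
    """
    base = 2 ** bits_per_digit
    acc = Fraction(0)
    for d in reversed(digits):
        acc = relu(Fraction(d, base) * x + Fraction(1, base) * acc)
    return acc


if __name__ == "__main__":
    w = Fraction(181, 256)            # 8 fractional bits
    x = Fraction(3, 5)                # nonnegative input
    digits = to_digits(w, bits_per_digit=2, num_digits=4)
    shallow = high_precision_layer(w, x)
    deep = deep_low_precision(digits, bits_per_digit=2, x=x)
    print(digits, shallow, deep, shallow == deep)
```

In this toy setting the 8-bit weight is replaced by four 2-bit digits, so depth grows by a factor of four while the per-weight precision drops by the same factor. The example makes no claim about memory-optimality, which in the paper is a statement about minimax approximation error rather than about a single multiplication.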

Three Quantization Regimes for ReLU Networks