We investigate the approximation efficiency of score functions by deep neural networks in diffusion-based generative modeling. While existing approximation theories exploit the smoothness of score functions, they suffer from the curse of dimensionality for intrinsically high-dimensional data. This limitation is pronounced in graphical models such as Markov random fields, which are common models for image distributions and for which the approximation efficiency of score functions remains unestablished. To address this, we observe that score functions in graphical models can often be well approximated by variational inference denoising algorithms. Furthermore, these algorithms are amenable to efficient neural network representation. We demonstrate this in examples of graphical models, including Ising models, conditional Ising models, restricted Boltzmann machines, and sparse encoding models. Combined with off-the-shelf discretization error bounds for diffusion-based sampling, we provide an efficient sample complexity bound for diffusion-based generative modeling when the score function is learned by deep neural networks.
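To make the link between denoising and score approximation concrete, the following is a minimal sketch and not the paper's construction. It assumes an Ising prior p(x) ∝ exp(xᵀAx/2 + hᵀx) on {-1,+1}^d with zero-diagonal coupling matrix A, a noisy observation z = alpha·x + sigma·g with standard Gaussian g, and a naive mean-field fixed-point iteration as the variational inference step. The function names and the parameters `alpha`, `sigma`, and `num_iters` are hypothetical choices made for illustration. Each iteration is an affine map followed by a tanh, which is the kind of step a neural network layer can represent. The score is then recovered from the approximate posterior mean via the standard identity ∇ log p_t(z) = (alpha·E[x|z] − z)/sigma².

```python
import math

def mean_field_denoiser(A, h, z, alpha, sigma, num_iters=50):
    """Naive mean-field approximation of E[x | z] (illustrative assumption).

    Prior: p(x) ∝ exp(x'Ax/2 + h'x), x in {-1,+1}^d, A symmetric, zero diagonal.
    Observation: z = alpha * x + sigma * g, g standard Gaussian.
    """
    d = len(z)
    # Conditioning on z keeps the Ising form and shifts the external field.
    field = [h[i] + alpha * z[i] / sigma**2 for i in range(d)]
    m = [0.0] * d
    for _ in range(num_iters):
        # One parallel update: affine map of m, then a coordinate-wise tanh.
        m = [
            math.tanh(sum(A[i][j] * m[j] for j in range(d)) + field[i])
            for i in range(d)
        ]
    return m

def score_from_denoiser(m, z, alpha, sigma):
    """Score of the noisy marginal from the (approximate) posterior mean."""
    return [(alpha * mi - zi) / sigma**2 for mi, zi in zip(m, z)]

# Usage on a hypothetical two-spin model with weak coupling.
A = [[0.0, 0.2], [0.2, 0.0]]
h = [0.1, -0.1]
z = [0.8, -0.5]
m = mean_field_denoiser(A, h, z, alpha=0.9, sigma=0.5)
s = score_from_denoiser(m, z, alpha=0.9, sigma=0.5)
```

With A = 0 the spins are independent and the iteration returns the exact posterior mean tanh(h + alpha·z/sigma²) after one step. For nonzero couplings the result is only an approximation, and the parallel update is not guaranteed to converge when the couplings are strong.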

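We study how efficiently deep neural networks approximate score functions in diffusion-based generative modeling. We observe that score functions in graphical models can be well approximated by variational inference denoising algorithms, and that these algorithms admit efficient neural network representations. We verify this observation on examples and, combined with discretization error bounds, obtain an efficient sample complexity bound for diffusion-based generative modeling.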

Deep Networks as Denoising Algorithms: Efficient Learning of Diffusion Models in High-Dimensional Graphical Models