We consider deep neural networks in a Bayesian framework with a prior distribution sampling the network weights at random. Following a recent idea of Agapiou and Castillo (2023), who show that heavy-tailed prior distributions achieve automatic adaptation to smoothness, we introduce a simple Bayesian deep learning prior based on heavy-tailed weights and ReLU activation. We show that the corresponding posterior distribution achieves near-optimal minimax contraction rates, simultaneously adaptive to both the intrinsic dimension and the smoothness of the underlying function, in a variety of contexts including nonparametric regression, geometric data and Besov spaces. While most works so far need a form of model selection built into the prior distribution, a key aspect of our approach is that it does not require sampling hyperparameters to learn the architecture of the network. We also provide variational Bayes counterparts of the results, which show that mean-field variational approximations still benefit from near-optimal theoretical support.
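To make the idea of a prior that samples heavy-tailed weights for a ReLU network concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the construction analysed here: the choice of a Student-t law, the degrees of freedom `dof`, the common `scale`, the layer `widths`, and the helper names `sample_student_t`, `sample_relu_network` and `forward` are all hypothetical.

```python
import math
import random


def sample_student_t(rng, dof):
    """Draw one Student-t variate as Z / sqrt(V / dof), with V ~ chi-square(dof)."""
    z = rng.gauss(0.0, 1.0)
    # chi-square(dof) is a Gamma distribution with shape dof/2 and scale 2.
    v = rng.gammavariate(dof / 2.0, 2.0)
    return z / math.sqrt(v / dof)


def sample_relu_network(widths, dof=2.0, scale=1.0, seed=0):
    """Draw all weights and biases i.i.d. from a scaled Student-t law.

    ASSUMPTION: Student-t weights with a fixed architecture `widths`;
    the heavy-tailed law and scaling used in the paper may differ.
    """
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        weights = [[scale * sample_student_t(rng, dof) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [scale * sample_student_t(rng, dof) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Evaluate the network: ReLU after every layer except the last."""
    h = list(x)
    for i, (weights, biases) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in h]
    return h


# One draw from the (hypothetical) prior, evaluated at a single input point.
net = sample_relu_network([2, 8, 8, 1])
y = forward(net, [0.3, -1.2])
```

Note that the architecture is fixed in advance and no hyperparameter is sampled; in this sketch the only randomness is in the weights themselves, which mirrors the feature emphasised above.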

Posterior and variational inference for deep neural networks with heavy-tailed weights