Improving the generalization ability of modern deep neural networks (DNNs) is a fundamental challenge in machine learning. Two branches of methods have been proposed to seek flat minima and improve generalization: one led by sharpness-aware minimization (SAM) minimizes the worst-case neighborhood loss through adversarial weight perturbation (AWP), and the other minimizes the expected Bayes objective with random weight perturbation (RWP). While RWP offers advantages in computation and is closely linked to AWP on a mathematical basis, its empirical performance has consistently lagged behind that of AWP. In this paper, we revisit the use of RWP for improving generalization and propose improvements from two perspectives: i) the trade-off between generalization and convergence and ii) the random perturbation generation. Through extensive experimental evaluations, we demonstrate that our enhanced RWP methods achieve greater efficiency in enhancing generalization, particularly in large-scale problems, while also offering comparable or even superior performance to SAM. The code is released at https://github.com/nblt/mARWP.
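The contrast between the two perturbation schemes described above can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation: the quadratic loss, the names `CURV`, `awp_perturbation`, `rwp_perturbation`, and `perturbed_grad`, and the values of `rho` and `sigma` are all assumptions chosen for readability, and the improvements proposed in the paper are not reproduced here.

```python
import numpy as np

# Toy diagonal quadratic loss L(w) = 0.5 * sum(CURV * w**2).
# This is a stand-in for a DNN training loss (assumption for illustration).
CURV = np.array([10.0, 1.0])


def loss(w):
    return 0.5 * float(np.sum(CURV * w ** 2))


def grad(w):
    return CURV * w


def awp_perturbation(w, rho=0.05):
    """SAM-style adversarial perturbation: one normalized ascent step of radius rho.

    Requires a gradient evaluation at w before the descent gradient can be computed.
    """
    g = grad(w)
    return rho * g / (np.linalg.norm(g) + 1e-12)


def rwp_perturbation(w, sigma=0.01, rng=None):
    """Random perturbation: isotropic Gaussian noise, no gradient needed."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(0.0, sigma, size=w.shape)


def perturbed_grad(w, eps):
    """Gradient taken at the perturbed weights, used for the descent step."""
    return grad(w + eps)


w = np.array([1.0, 1.0])
# AWP: two sequential gradient evaluations per step (ascent, then descent).
g_awp = perturbed_grad(w, awp_perturbation(w))
# RWP: a single gradient evaluation per step.
g_rwp = perturbed_grad(w, rwp_perturbation(w, rng=np.random.default_rng(0)))
```

In this sketch the adversarial variant needs the gradient at `w` before it can form its perturbation, while the random variant draws its perturbation independently of the gradient, which is the computational advantage the abstract attributes to RWP.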

Revisiting Random Weight Perturbation for Efficiently Improving Generalization