BriefGPT.xyz
May, 2022
Sharpness-aware training
Sharpness-Aware Training for Free
Jiawei Du, Daquan Zhou, Jiashi Feng, Vincent Y. F. Tan, Joey Tianyi Zhou
TL;DR
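This paper proposes a sharpness-aware training method that adds almost no computational cost and reduces the generalization error caused by over-parameterization. The method uses a KL-divergence term to steer training toward a flat minimum, achieving results comparable to SAM while making training more efficient.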
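As a rough illustration of what a KL-divergence term looks like in practice, the sketch below computes the KL divergence between two softmax distributions built from logits. The summary does not say which distributions the method compares, so the pairing used here (outputs recorded earlier in training versus current outputs) is an assumption, as are the hypothetical names `softmax`, `kl_divergence`, `kl_penalty` and the `temperature` parameter. It is not the authors' implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities (numerically stable)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def kl_penalty(past_logits, current_logits, temperature=1.0):
    """Hypothetical penalty: how far current outputs drifted from past ones.

    Assumption: past_logits were recorded earlier in training for the
    same input; the summary does not specify this pairing.
    """
    p = softmax(past_logits, temperature)
    q = softmax(current_logits, temperature)
    return kl_divergence(p, q)


if __name__ == "__main__":
    # Identical outputs give zero penalty; diverging outputs give a positive one.
    print(kl_penalty([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    print(kl_penalty([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

In a training loop, a term of this kind would typically be added to the ordinary loss with a weighting coefficient, so that it costs only an extra comparison of outputs and no additional forward or backward pass.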
Abstract
Modern deep neural networks (DNNs) have achieved state-of-the-art performances but are typically over-parameterized. The over-parameterization may result in undesirably large generalization error in the absence o