BriefGPT.xyz
Jul, 2020
Neural networks with late-phase weights
Economical ensembles with hypernetworks
João Sacramento, Johannes von Oswald, Seijin Kobayashi, Christian Henning, Benjamin F. Grewe
TL;DR
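When training neural networks with stochastic gradient descent, taking a weighted average of a subset of the trained parameters yields better results without adding computational cost, as validated on CIFAR-10/100, ImageNet, and other benchmarks.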
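The averaging step summarized above can be illustrated with a minimal sketch. It assumes the model's parameters are stored as a dictionary of flat lists of floats, that K copies of a small parameter subset have already been trained, and that the remaining parameters are shared. The function name `average_parameter_subset` and the parameter names `conv.weight` and `head.bias` are hypothetical, chosen for illustration only; this is not the authors' implementation.

```python
def average_parameter_subset(shared, members, coeffs=None):
    """Merge K copies of a parameter subset into a single model.

    shared  : dict name -> list of floats, parameters common to all members
    members : list of K dicts, each holding one copy of the averaged subset
    coeffs  : optional K averaging coefficients (uniform if omitted)
    """
    k = len(members)
    if coeffs is None:
        coeffs = [1.0] * k
    # Normalize so the coefficients sum to one.
    total = sum(coeffs)
    coeffs = [c / total for c in coeffs]

    merged = dict(shared)  # shared parameters are copied over unchanged
    for name in members[0]:
        # Group the k-th copies of each scalar entry, then average them.
        columns = zip(*(m[name] for m in members))
        merged[name] = [
            sum(c * v for c, v in zip(coeffs, col)) for col in columns
        ]
    return merged


# Hypothetical example: two copies of a bias vector, one shared weight.
shared = {"conv.weight": [1.0, 1.0, 1.0, 1.0]}
members = [{"head.bias": [0.0, 2.0]}, {"head.bias": [2.0, 4.0]}]
model = average_parameter_subset(shared, members)
```

Because the result is a single set of parameters, inference costs the same as for one network, which is consistent with the summary's claim about computational cost.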
Abstract
Averaging the predictions of many independently trained neural networks is a simple and effective way of improving generalization in deep learning. However, this strategy rapidly becomes costly, as the number of