The Propensity for Density in Feed-forward Models

Oct, 2024

Nandi Schoots, Alex Jackson, Ali Kholmovaia, Peter McBurney, Murray Shanahan

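TL;DR: This study asks whether training a neural network tends to use all of the available weights even when the task could be solved with fewer. The results indicate that model width has only a limited effect on the fraction of weights that can be pruned, and that this prunable fraction is largely consistent across models of different sizes. The finding suggests substantial pruning potential at every model scale, which has practical implications.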
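To make "the fraction of weights that can be pruned" concrete, here is a minimal sketch of global magnitude pruning applied to two-layer weight sets of different widths. It is an illustration under stated assumptions, not the authors' procedure: the pruning criterion (smallest absolute value), the helper names `magnitude_prune` and `sparsity`, the layer shapes, and the use of untrained random weights are all hypothetical choices made for the example.

```python
import numpy as np


def magnitude_prune(weights, fraction):
    """Return copies of `weights` with the smallest-magnitude entries zeroed.

    `weights` is a list of NumPy arrays (one per layer). Pruning is global:
    a single magnitude threshold is shared by all layers. This criterion is
    an assumption for illustration, not necessarily the paper's method.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    magnitudes = np.concatenate([np.abs(w).ravel() for w in weights])
    k = int(round(fraction * magnitudes.size))  # number of weights to remove
    if k == 0:
        return [w.copy() for w in weights]
    # Magnitude of the k-th smallest weight; entries at or below it are pruned.
    # (If several weights tie at the threshold, slightly more than k are removed.)
    threshold = np.partition(magnitudes, k - 1)[k - 1]
    return [np.where(np.abs(w) <= threshold, 0.0, w) for w in weights]


def sparsity(weights):
    """Fraction of entries that are exactly zero across all layers."""
    total = sum(w.size for w in weights)
    zeros = sum(int(np.count_nonzero(w == 0)) for w in weights)
    return zeros / total


# Hypothetical widths and shapes: a 10-input, 2-output network with one hidden layer.
rng = np.random.default_rng(0)
for width in (16, 64, 256):
    layers = [rng.normal(size=(10, width)), rng.normal(size=(width, 2))]
    pruned = magnitude_prune(layers, fraction=0.9)
    print(f"width={width}: sparsity after pruning = {sparsity(pruned):.3f}")
```

Because the weights here are random rather than trained, the loop only shows how sparsity is measured at each width. Reproducing the study's question would additionally require training each model and checking how much task performance survives at each pruning fraction.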

Abstract

Does the process of training a neural network to solve a task tend to use all of the available weights even when the task could be solved with fewer weights? To address this question we study the effects of pruning fully connected, convolutional and Residual Models while varying their