Pruning has emerged as a promising approach for compressing large-scale models, yet its effectiveness in recovering the sparsest of models has not yet been explored. We conducted an extensive series of 485,838 experiments, applying a range of state-of-the-art pruning algorithms to a synthetic dataset we created, named the Cubist Spiral. Our findings reveal a significant gap in performance compared to ideal sparse networks, which we identified through a novel combinatorial search algorithm. We attribute this performance gap to current pruning algorithms' poor behaviour under overparameterization, their tendency to induce disconnected paths throughout the network, and their propensity to get stuck at suboptimal solutions, even when given the optimal width and initialization. This gap is concerning, given the simplicity of the network architectures and datasets used in our study. We hope that our research encourages further investigation into new pruning techniques that strive for true network sparsity.

Pruning's Limitations Exposed: The Challenge of Sparse Models