GradMax: Growing Neural Networks using Gradient Information

Jan, 2022

GradMax: Growing Neural Networks using Gradient Information

Utku Evci, Max Vladymyrov, Thomas Unterthiner, Bart van Merriënboer, Fabian Pedregosa

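TL;DR: This paper introduces GradMax, a technique that adds new neurons to a network during training without changing what the network has already learned, while also improving the training dynamics. The best initialization for the new neurons is found efficiently with a singular value decomposition (SVD), so the architecture can be optimized as training proceeds.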
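To make the idea concrete, here is a minimal NumPy sketch of growing one hidden layer in this spirit. It is an illustration under stated assumptions, not the authors' implementation: the names `grow_hidden_layer`, `forward`, `grad_z2`, `h0`, and `scale` are hypothetical, the network is a two-layer MLP without biases, and the activation is ReLU (so a neuron with all-zero incoming weights outputs zero). The new incoming weights are set to zero, which leaves the network's output unchanged, and the new outgoing weights are taken from the top left singular vectors of a batch-averaged gradient matrix.

```python
import numpy as np


def forward(W1, W2, x):
    """Two-layer ReLU MLP without biases: x -> relu(x W1^T) W2^T."""
    return np.maximum(x @ W1.T, 0.0) @ W2.T


def grow_hidden_layer(W1, W2, k, grad_z2, h0, scale=1.0):
    """Add k hidden neurons (hypothetical helper, for illustration only).

    W1:      (hidden, n_in)  incoming weights of the layer being grown
    W2:      (n_out, hidden) outgoing weights of the layer being grown
    grad_z2: (batch, n_out)  loss gradient w.r.t. the next layer's pre-activations
    h0:      (batch, n_in)   activations that feed the layer being grown
    scale:   Frobenius norm given to the new outgoing weights (assumption)
    """
    # Batch-averaged outer product of downstream gradients and upstream inputs.
    M = grad_z2.T @ h0 / h0.shape[0]              # shape (n_out, n_in)
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    if k > U.shape[1]:
        raise ValueError("k cannot exceed min(n_out, n_in) in this sketch")

    # Outgoing weights: top-k left singular vectors, rescaled to norm `scale`.
    W2_new = U[:, :k]
    W2_new = W2_new * (scale / np.linalg.norm(W2_new))

    # Incoming weights: zeros, so the new neurons output relu(0) = 0
    # and the function computed by the network is unchanged.
    W1_new = np.zeros((k, W1.shape[1]))

    return np.vstack([W1, W1_new]), np.hstack([W2, W2_new])
```

Calling `grow_hidden_layer(W1, W2, 2, grad_z2, h0)` on a network with 4 hidden units returns weights for 6 hidden units, and `forward` gives the same outputs before and after growing. Zero incoming weights are what preserve the learned function; the SVD step only decides which directions the new outgoing weights point in.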

Abstract

The architecture and the parameters of neural networks are often optimized independently, which requires costly retraining of the parameters whenever the architecture is modified. In this work we instead focus on growing the architecture without requiring costly retraining. We present