Bernhard Bermeitinger, Tomas Hrycej, Siegfried Handschuh
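TL;DR: By studying residual connections in deep neural networks, the paper proposes a parallel shallow architecture as an alternative. Truncating the higher-order terms of a Taylor series expression shows that wide, shallow architectures perform comparably to conventional deep ones, a finding that may simplify network architectures, improve optimization efficiency, and speed up training.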
Abstract
Deep neural networks have a strong track record and are thus viewed as the
best architecture choice for complex applications. Their main shortcoming has
long been the vanishing gradient, which prevented numerical
optimization algorithms from converging acceptably. A br