Training directed neural networks typically requires forward-propagating data
through a computation graph, followed by backpropagating an error signal, to
produce weight updates. All layers, or more generally, modules, of the network
are therefore locked, in the sense that they must wait