Recurrent neural network is a powerful model that learns temporal patterns in
sequential data. For a long time, it was believed that recurrent networks are
difficult to train using simple optimizers, such as stochastic gradient
descent, due to the so-called vanishing gradient problem.