Long short-term memory (LSTM) has been widely used for sequential data
modeling. Researchers have increased LSTM depth by stacking LSTM cells to
improve performance. However, stacking introduces model redundancy, increases
run-time delay, and makes LSTMs more prone to overfitting. To address these