Encoder-decoder based sequence-to-sequence (S2S) learning has made remarkable
progress in recent years. Different network architectures have been used in the
encoder/decoder. Among them, Convolutional Neural Networks (CNN) and
Self-Attention Networks (SAN) are the prominent ones. The t