deep neural networks (DNNs) are powerful models that have achieved excellent
performance on difficult learning tasks. Although DNNs work well whenever large
labeled training sets are available, they cannot be used to map sequences to
sequences. In this paper, we present a general end-t