Simple Recurrent Units for Highly Parallelizable Recurrence

Sep, 2017

Training RNNs as Fast as CNNs

Tao Lei, Yu Zhang

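TL;DR: This paper proposes the Simple Recurrent Unit (SRU), a light recurrent unit that addresses the poor scalability of conventional recurrent neural networks, whose state computations are hard to parallelize. SRU is expressive, highly parallelizable, and easy to train. It performs strongly on multiple natural language processing tasks and achieves a 5-9x speed-up on classification and question answering datasets, with better results than LSTM and convolutional models. Incorporating SRU into the Transformer model also improves translation results by 0.7 BLEU on average.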
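To make the parallelization claim concrete, here is a minimal pure-Python sketch contrasting a conventional recurrence with a light recurrence of the kind the summary describes. The gating form, the function names (`full_recurrence`, `light_recurrence`), and the weight names (`W`, `U`, `W_f`) are assumptions chosen for illustration; this is not the paper's exact formulation. It shows only the structural idea: when the gate and candidate depend on the input alone, the expensive matrix products can be computed for all time steps up front, leaving a cheap element-wise loop as the only sequential part.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, x):
    # Plain matrix-vector product on nested lists.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def full_recurrence(xs, W, U):
    """Conventional RNN step: h_t = tanh(W x_t + U h_{t-1}).

    The product U h_{t-1} cannot start until step t-1 has finished,
    so the heavy computation is serialized over time.
    """
    h = [0.0] * len(U)
    hs = []
    for x in xs:
        wx = matvec(W, x)
        uh = matvec(U, h)  # depends on the previous step's output
        h = [math.tanh(a + b) for a, b in zip(wx, uh)]
        hs.append(h)
    return hs


def light_recurrence(xs, W, W_f):
    """Illustrative light recurrence (assumed form, not the paper's equations):

        f_t = sigmoid(W_f x_t)
        c_t = f_t * c_{t-1} + (1 - f_t) * (W x_t)
    """
    # Stage 1: no dependence between time steps, so every t could be
    # computed at once (for example as one batched matrix product).
    cands = [matvec(W, x) for x in xs]
    gates = [[sigmoid(z) for z in matvec(W_f, x)] for x in xs]

    # Stage 2: the only sequential part is a cheap element-wise update.
    c = [0.0] * len(W)
    cs = []
    for f, u in zip(gates, cands):
        c = [fi * ci + (1.0 - fi) * ui for fi, ci, ui in zip(f, c, u)]
        cs.append(c)
    return cs


if __name__ == "__main__":
    xs = [[1.0, 2.0], [3.0, 4.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    print(light_recurrence(xs, identity, zeros))
```

In the light variant, each dimension of the state is updated independently in the sequential loop, which is the property that lets such a loop be fused into a single fast kernel on parallel hardware.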

Abstract

Recurrent neural networks scale poorly due to the intrinsic difficulty in parallelizing their state computations. For instance, the forward pass computation of $h_t$ is blocked until the entire computation of $h_{t-1}$ finishes, which is a major bottleneck for parallel computing. In th