Feb, 2020
Temporal Convolutional Attention-based Network For Sequence Modeling
Hongyan Hao, Yan Wang, Yudi Xia, Jian Zhao, Furao Shen
TL;DR
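We propose an exploratory architecture called TCAN, based on temporal convolutional networks and an attention mechanism. It serves as an approximate substitute for recurrent networks while also absorbing the advantages of feed-forward models, and it improves bpc/perplexity on text datasets such as word-level PTB, character-level PTB, and WikiText-2.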
Abstract
With the development of feed-forward models, the default model for sequence modeling has gradually evolved to replace recurrent networks. Many powerful