ICML, Jun 2024
An All-MLP Sequence Modeling Architecture That Excels at Copying
Chenwei Cui, Zehao Yan, Gedeon Muhawenayo, Hannah Kerner
TL;DR: Causal Relation Network (CausalRN) is an all-MLP sequence modeling architecture that can match Transformers on the copying task. Its key innovations include exponentially activated Relation Networks (RNs) and pre-activation normalization, and the work offers insights into what enables strong in-context retrieval.