BriefGPT.xyz
Oct, 2023
Manifold-Preserving Transformers are Effective for Short-Long Range Encoding
Ayan Sengupta, Md Shad Akhtar, Tanmoy Chakraborty
TL;DR
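TransJect is an encoder model that guarantees layer-wise distance preservation. It learns to map token representations onto different manifolds with similar topology while preserving the Euclidean distance between every pair of tokens, and it shows clear improvements across multiple tasks.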
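To make the distance-preservation property concrete, here is a minimal sketch assuming a single orthogonal linear map as a stand-in for a layer. It is not the TransJect architecture; the names `pairwise_distances`, `tokens`, and `q`, and the sizes used, are hypothetical and chosen only for illustration. It shows that an orthogonal transformation leaves the Euclidean distance between every pair of token vectors unchanged.

```python
import numpy as np

def pairwise_distances(x):
    """Return the (n_tokens, n_tokens) matrix of Euclidean distances between rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(0)

# Five hypothetical token embeddings of width 8 (illustrative sizes, not from the paper).
tokens = rng.normal(size=(5, 8))

# Random orthogonal matrix from a QR decomposition; an assumed stand-in for one layer.
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))

transformed = tokens @ q

before = pairwise_distances(tokens)
after = pairwise_distances(transformed)

# True when every pairwise distance survives the transformation (up to float tolerance).
distances_preserved = np.allclose(before, after)
print(distances_preserved)
```

An ordinary dense layer with an unconstrained weight matrix would generally not pass this check, which is the kind of contrast the summary draws.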
Abstract
Multi-head self-attention-based transformers have shown promise in different learning tasks. Albeit these models exhibit significant improvement in understanding short-term and long-term contexts from sequences,