BriefGPT.xyz
Jun, 2019
Visualizing and Measuring the Geometry of BERT
Andy Coenen, Emily Reif, Ann Yuan, Been Kim, Adam Pearce...
TL;DR
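This paper describes BERT, a particularly effective model that represents linguistic information by extracting generally useful linguistic features from semantic and syntactic subspaces. It also examines the syntactic representations found in attention matrices and word embeddings, and presents a mathematical proof to explain the geometry of these representations.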
Abstract
Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract ...