Jun, 2019
What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning
TL;DR
This paper proposes new methods for analyzing BERT's attention mechanisms and their outputs and applies them to probe the model's internal structure. It shows that BERT's attention heads correlate clearly with linguistic syntax and coreference: certain attention heads pick out the direct objects of verbs, the determiners of nouns, and the objects of prepositions with high accuracy.
Abstract
Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data. Most recent analysis has focused on model outputs (e.g., language model surprisal) or internal vector representations (e.g., probing classifiers). Complementary to these works, we propose methods for analyzing the attention mechanisms of pre-trained models and apply them to BERT. BERT's attention heads exhibit patterns such as attending to delimiter tokens, specific positional offsets, or broadly attending over the whole sentence, with heads in the same layer often exhibiting similar behaviors. We further show that certain attention heads correspond well to linguistic notions of syntax and coreference. For example, we find heads that attend to the direct objects of verbs, determiners of nouns, objects of prepositions, and coreferent mentions with remarkably high accuracy. Lastly, we propose an attention-based probing classifier and use it to demonstrate that substantial syntactic information is captured in BERT's attention.
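For readers who want to try this kind of inspection themselves, the sketch below is a minimal illustration, assuming the HuggingFace `transformers` library and the `bert-base-uncased` checkpoint (neither of which the paper itself prescribes); the layer and head indices are hypothetical placeholders, not the specific heads identified in the paper. It extracts the per-head attention maps and prints, for each token, the token it attends to most strongly:

```python
import torch
from transformers import BertTokenizer, BertModel

# Load BERT-base with attention-map outputs enabled.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "The dog chased the ball across the yard."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer (12 for BERT-base),
# each of shape [batch, num_heads, seq_len, seq_len].
layer, head = 7, 9  # hypothetical indices, chosen purely for illustration
attn = outputs.attentions[layer][0, head]

# For each token, report the token receiving the most attention from it.
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for i, tok in enumerate(tokens):
    j = attn[i].argmax().item()
    print(f"{tok:>10} -> {tokens[j]}  ({attn[i, j].item():.2f})")
```

Surface patterns such as attending to delimiter tokens ([CLS], [SEP]) or to the previous/next token show up immediately in maps like these; finding the syntax- and coreference-tracking heads the paper describes requires scoring each of the 144 layer-head pairs of BERT-base against annotated data, as the authors do.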