Oct 2022
Revisiting Attention Weights as Explanations from an Information Theoretic Perspective
Bingyang Wen, K. P. Subbalakshmi, Fan Yang
TL;DR
From an information-theoretic perspective, the paper studies how well different types of attention mechanisms preserve information about the input and serve as explanations of model inputs, and reports several findings.
Abstract
Attention mechanisms have recently demonstrated impressive performance on a range of NLP tasks, and attention scores are often used as a proxy for model …
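The practice the abstract alludes to, reading attention scores off a trained transformer and ranking input tokens by them, can be sketched as follows. This is a minimal illustrative sketch, not the paper's method: it assumes a Hugging Face `bert-base-uncased` model, and the last-layer, head-averaged, [CLS]-row aggregation is just one common convention.

```python
# A minimal sketch (illustrative assumptions: bert-base-uncased, last-layer
# head-averaged [CLS]-row aggregation) of using attention scores as a proxy
# for token importance. This is NOT the paper's proposed method.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

text = "The movie was surprisingly good."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, each (batch, heads, seq, seq).
# Take the last layer, average over heads, and read the attention paid
# from [CLS] (position 0) to every token.
last_layer = outputs.attentions[-1]           # (1, heads, seq, seq)
cls_attention = last_layer.mean(dim=1)[0, 0]  # (seq,)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, score in sorted(zip(tokens, cls_attention.tolist()),
                           key=lambda x: -x[1]):
    print(f"{token:>12s}  {score:.3f}")
```

Whether rankings produced this way actually identify the inputs the model relies on is exactly the question the paper examines.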