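TL;DR: This work extends Kim et al. (2016) by proposing a gradient-based visualization technique. It shows that the Hadamard product in multimodal deep networks applies not only to visual inputs but also to textual inputs, and that the technique can visualize the Hadamard product's attention mechanism over both visual and textual inputs.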
Abstract
Visual explanations of a model's learned representations help in understanding the fundamentals of learning. The attentional models of previous works visualized the attended regions over an image or text u