Various attribution methods have been developed to explain deep neural networks (DNNs) by inferring the attribution (importance, or contribution) score of each input variable to the final output. However, existing attribution methods are often built upon different heuristics, and there remains no unified theoretical understanding of why these methods are effective and how they are related. To this end, we formulate, for the first time, the core mechanisms of fourteen attribution methods, which were designed on different heuristics, into the same mathematical system, i.e., the system of Taylor interactions. Specifically, we prove that the attribution scores estimated by the fourteen attribution methods can all be reformulated as a weighted sum of two types of effects: the independent effect of each individual input variable and the interaction effects between input variables. The essential difference among the fourteen methods lies mainly in the weights used to allocate these effects. Based on these findings, we propose three principles for a fair allocation of effects and use them to evaluate the faithfulness of the fourteen attribution methods.

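This paper is the first to unify the core mechanisms of fourteen heuristically designed attribution methods into a single mathematical system. It proves that the attribution scores of all fourteen methods can be reformulated as a weighted sum of two types of effects, namely the independent effect of each input variable and the interaction effects between input variables, and it proposes three principles for a fair allocation of effects to evaluate the faithfulness of the fourteen attribution methods.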
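To make the idea of a "weighted sum of independent and interaction effects" concrete, the following is a minimal sketch on a toy model, not the paper's derivation. Everything in it is an assumption chosen for illustration: the two-variable function `f`, its coefficients, the all-zero baseline, and the helper names `mask`, `shapley`, and `occlusion`. It compares two widely used attribution rules (a Shapley-value attribution and a single-variable occlusion attribution) and shows that, on this toy model, they differ only in how much of the interaction term each variable receives.

```python
from itertools import permutations

# Hypothetical toy model (an assumption for illustration):
#   f(x1, x2) = A*x1 + B*x2 + C*x1*x2, with an all-zero baseline.
# Relative to that baseline, A*x1 and B*x2 are the independent effects
# and C*x1*x2 is the single interaction effect between x1 and x2.
A, B, C = 1.0, 2.0, 3.0


def f(x):
    return A * x[0] + B * x[1] + C * x[0] * x[1]


def mask(x, keep, baseline):
    """Keep the variables whose indices are in `keep`; reset the rest to the baseline."""
    return [x[i] if i in keep else baseline[i] for i in range(len(x))]


def shapley(f, x, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        present = set()
        for i in order:
            before = f(mask(x, present, baseline))
            present.add(i)
            after = f(mask(x, present, baseline))
            phi[i] += (after - before) / len(orders)
    return phi


def occlusion(f, x, baseline):
    """Single-variable occlusion: output drop when only variable i is masked."""
    everything = set(range(len(x)))
    return [f(x) - f(mask(x, everything - {i}, baseline)) for i in range(len(x))]


x = [2.0, 5.0]
baseline = [0.0, 0.0]

independent = [A * x[0], B * x[1]]  # independent effects of x1 and x2
interaction = C * x[0] * x[1]       # interaction effect shared by x1 and x2

shap = shapley(f, x, baseline)
occ = occlusion(f, x, baseline)

print("independent effects:", independent)
print("interaction effect: ", interaction)
print("Shapley attribution:  ", shap)
print("Occlusion attribution:", occ)
```

On this toy model the Shapley attribution of each variable equals its independent effect plus half of the interaction effect, so the attributions sum to `f(x) - f(baseline)`. The occlusion attribution gives each variable its independent effect plus the whole interaction effect, so the interaction is counted twice and the attributions sum to more than the output change.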

Understanding and Unifying Fourteen Attribution Methods with Taylor Interactions