Neural code intelligence models remain 'black boxes' to the human programmer. This opacity limits their use in code intelligence tasks, particularly in applications such as vulnerability detection, where a model's reliance on spurious correlations can be safety-critical. We introduce a neuron-level approach to the interpretability of neural code intelligence models that eliminates redundancy caused by highly similar or task-irrelevant neurons within these networks. We evaluate the remaining important neurons using probing classifiers, which are often used to ascertain whether certain properties have been encoded in the latent representations of neural models. However, probing accuracies may be artificially inflated by the repetitive and deterministic nature of tokens in code datasets. We therefore adapt the selectivity metric, originally introduced in NLP to account for probe memorization, to formulate our source-code probing tasks. Through our neuron analysis, we find that more than 95% of the neurons are redundant with respect to our code intelligence tasks and can be eliminated without significant loss in accuracy. We further trace individual important neurons, and subsets of them, to specific code properties, which could be used to influence model predictions. We demonstrate that it is possible to identify 'number' neurons, 'string' neurons, and higher-level 'text' neurons that are responsible for specific code properties. This could potentially be used to modify neurons responsible for predictions based on incorrect signals. Additionally, the distribution and concentration of the important neurons within different source-code embeddings can be used as measures of task complexity, to compare source-code embeddings, and to guide training choices for transfer learning over similar tasks.
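
To make the selectivity idea concrete, the following is a minimal sketch of the metric as it is commonly defined in NLP probing work: probe accuracy on the real task minus probe accuracy on a control task in which every distinct token type receives one fixed random label. The abstract does not spell out how the metric is adapted to source code, so the function names (`make_control_labels`, `accuracy`, `selectivity`) and the per-token-type labelling are illustrative assumptions, not the authors' implementation.

```python
import random


def make_control_labels(tokens, num_labels, seed=0):
    """Build a control task: each distinct token type gets one fixed random label.

    Assumption: labels are drawn uniformly from range(num_labels). A probe can
    only do well on these labels by memorizing token identities.
    """
    rng = random.Random(seed)
    label_of_type = {}
    labels = []
    for tok in tokens:
        if tok not in label_of_type:
            label_of_type[tok] = rng.randrange(num_labels)
        labels.append(label_of_type[tok])
    return labels


def accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and equally long")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def selectivity(task_accuracy, control_accuracy):
    """Selectivity: real-task probe accuracy minus control-task probe accuracy."""
    return task_accuracy - control_accuracy
```

Under this definition, a probe with high accuracy but low selectivity is likely memorizing tokens rather than reading a property from the representation, which is the inflation effect the abstract attributes to repetitive code tokens.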

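This paper proposes a neuron-level approach to the interpretability of neural code intelligence models. It removes highly similar or task-irrelevant neurons and evaluates the remaining important neurons with probing classifiers, finding that more than 95% of the neurons are redundant for the code intelligence tasks studied and can be removed without significant loss in accuracy. It further traces individual important neurons and subsets of them, identifying 'number', 'string', and higher-level 'text' neurons responsible for specific code properties. These could be used to correct neurons whose predictions rely on incorrect signals, and the distribution and concentration of important neurons can serve as a measure of task complexity.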
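
One way to picture the removal of highly similar neurons is a greedy correlation filter over recorded activations. The sketch below is an assumption made for illustration: the text says only that highly similar neurons are eliminated, so the use of Pearson correlation, the 0.9 threshold, and the names `pearson` and `drop_similar_neurons` are hypothetical rather than the authors' procedure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long activation sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant neuron has no defined correlation; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_similar_neurons(activations, threshold=0.9):
    """Return indices of neurons kept after greedy similarity filtering.

    activations: one list of activation values per neuron, all recorded on
    the same inputs. A neuron is kept only if its absolute correlation with
    every previously kept neuron is below `threshold` (an assumed value).
    """
    kept = []
    for i, acts in enumerate(activations):
        if all(abs(pearson(acts, activations[j])) < threshold for j in kept):
            kept.append(i)
    return kept
```

For example, with activations `[[1, 2, 3, 4], [2, 4, 6, 8], [1, -1, 1, -1]]`, the second neuron is a scaled copy of the first and is dropped, while the third is only weakly correlated and is kept.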

Interpreting Pretrained Source-Code Models Using Neuron Redundancy Analysis