Conditional mutual information (CMI) and normalized conditional mutual information (NCMI) are introduced to measure the concentration and separation performance of a classification deep neural network (DNN) in its output probability distribution space, where CMI and the ratio of CMI to NCMI represent the intra-class concentration and inter-class separation of the DNN, respectively. Evaluating popular DNNs pretrained on ImageNet with NCMI shows that their accuracies on the ImageNet validation set are roughly inversely proportional to their NCMI values. Based on this observation, the standard deep learning (DL) framework is modified to minimize the standard cross-entropy function subject to an NCMI constraint, yielding CMI constrained deep learning (CMIC-DL). A novel alternating learning algorithm is proposed to solve this constrained optimization problem. Extensive experimental results show that DNNs trained within CMIC-DL outperform state-of-the-art models trained within the standard DL framework and with other loss functions from the literature, in terms of both accuracy and robustness against adversarial attacks. In addition, visualizing the evolution of the learning process through the lens of CMI and NCMI is advocated.
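
To make the two quantities concrete, the following is a minimal sketch of how CMI and NCMI might be estimated from a batch of softmax outputs. The abstract does not give the formulas, so the definitions here are assumptions: concentration is taken as the average KL divergence between each sample's output distribution and the mean output distribution (centroid) of its class, and separation as the average cross entropy between the outputs of pairs of samples with different labels. The function names `empirical_cmi`, `empirical_separation`, and `empirical_ncmi` are hypothetical.

```python
import numpy as np


def empirical_cmi(probs, labels, eps=1e-12):
    """Assumed CMI estimate: mean KL(p_i || centroid of the class of sample i)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        p = probs[labels == c]          # outputs of all samples in class c
        q = p.mean(axis=0)              # class centroid in probability space
        kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)
        total += kl.sum()
    return total / len(labels)


def empirical_separation(probs, labels, eps=1e-12):
    """Assumed separation estimate: mean over all ordered pairs (i, j) of
    1[y_i != y_j] * H(p_i, p_j), with H the cross entropy."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    cross = -probs @ np.log(probs + eps).T       # cross[i, j] = H(p_i, p_j)
    mask = labels[:, None] != labels[None, :]    # True for differently labeled pairs
    return float((cross * mask).mean())


def empirical_ncmi(probs, labels):
    """NCMI as CMI divided by separation, so that CMI / NCMI is the separation."""
    return empirical_cmi(probs, labels) / empirical_separation(probs, labels)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    tight = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
    loose = [[0.9, 0.1], [0.6, 0.4], [0.1, 0.9], [0.4, 0.6]]
    print("tight NCMI:", empirical_ncmi(tight, labels))
    print("loose NCMI:", empirical_ncmi(loose, labels))
```

Under these assumed definitions, a class whose samples all produce the same output distribution contributes zero to CMI, and spreading the outputs within a class increases it. The pairwise separation term builds an N-by-N matrix, so for a large validation set it would need to be computed in batches or on sampled pairs.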


Conditional Mutual Information Constrained Deep Learning for Classification