Proper confidence calibration of deep neural networks is essential for reliable predictions in safety-critical tasks. Miscalibration can lead to model over-confidence and/or under-confidence; i.e., the model's confidence in its prediction can be greater or less than the model's accuracy. Recent studies have highlighted the over-confidence issue by introducing calibration techniques and have demonstrated success on various tasks. However, miscalibration through under-confidence has yet to receive much attention. In this paper, we argue that the under-confidence issue deserves equal attention. We first introduce a novel metric, a miscalibration score, to identify the overall and class-wise calibration status, including whether the model is over- or under-confident. Our proposed metric reveals a pitfall of existing calibration techniques: they often over-calibrate the model and worsen under-confident predictions. We then use the class-wise miscalibration score as a proxy to design a calibration technique that can tackle both over- and under-confidence. We report extensive experiments showing that our proposed methods substantially outperform existing calibration techniques. We also validate our proposed calibration technique on an automatic failure detection task with a risk-coverage curve, reporting that our methods improve failure detection as well as the trustworthiness of the model. The code is available at \url{https://github.com/AoShuang92/miscalibration_TS}.
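To make the distinction between over- and under-confidence concrete, the following is a minimal sketch of a signed calibration gap (mean confidence minus accuracy), computed overall and per predicted class. It is an illustrative assumption, not the paper's definition of the miscalibration score; the function name `signed_confidence_gap` and the choice to group samples by predicted class are hypothetical.

```python
import numpy as np


def signed_confidence_gap(probs, labels):
    """Return (overall_gap, per_class_gap) where gap = mean confidence - accuracy.

    Assumption for illustration only: a positive gap is read as over-confidence
    and a negative gap as under-confidence. This is not the paper's exact metric.
    """
    probs = np.asarray(probs, dtype=float)   # shape (n_samples, n_classes)
    labels = np.asarray(labels)              # shape (n_samples,)

    preds = probs.argmax(axis=1)             # predicted class per sample
    conf = probs.max(axis=1)                 # confidence of the predicted class
    correct = (preds == labels).astype(float)

    overall_gap = float(conf.mean() - correct.mean())

    per_class_gap = {}
    for c in range(probs.shape[1]):
        mask = preds == c                    # samples predicted as class c
        if mask.any():
            per_class_gap[c] = float(conf[mask].mean() - correct[mask].mean())
    return overall_gap, per_class_gap


if __name__ == "__main__":
    # Toy data: class 0 predictions are over-confident, class 1 under-confident.
    toy_probs = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.3, 0.7]]
    toy_labels = [0, 1, 1, 1]
    overall, per_class = signed_confidence_gap(toy_probs, toy_labels)
    print("overall gap:", overall)
    print("class-wise gaps:", per_class)
```

On this toy data the overall gap is approximately zero while the two class-wise gaps are roughly +0.35 and -0.35, which illustrates why a class-wise, signed view can expose miscalibration that an aggregate number hides.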

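Proper confidence calibration of deep neural networks is essential for reliable predictions in safety-critical tasks. Recent studies have highlighted the over-confidence issue by introducing calibration techniques and have demonstrated success on various tasks. However, the under-confidence issue has not yet received sufficient attention. In this paper, we first introduce a novel metric, the miscalibration score, to identify the overall and class-wise calibration status, including over- or under-confidence. Our metric reveals a shortcoming of existing calibration techniques: they often over-calibrate the model and worsen under-confident predictions. We then use the class-wise miscalibration score as a proxy to design a calibration technique that can handle both over-confidence and under-confidence. Extensive experiments show that our proposed methods substantially outperform existing calibration techniques. We also validate our calibration technique on an automatic failure detection task using a risk-coverage curve; the results show that our methods improve both failure detection performance and the trustworthiness of the model. The code is available at https://github.com/AoShuang92/miscalibration_TS.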

Two Sides of Miscalibration: Identifying Over-Confident and Under-Confident Predictions in Network Calibration