The evaluation of deep learning models has traditionally focused on criteria such as accuracy, F1 score, and related measures. The increasing availability of high computational power environments allows the creation of deeper and more complex models. However, the computations needed to