For many applications of probabilistic classifiers it is important that the predicted confidence vectors reflect true probabilities (one says that the classifier is calibrated). It has been shown that common models fail to satisfy this property, making reliable methods for measuring an