Machine learning predictors are successfully deployed in applications ranging from disease diagnosis, to predicting credit scores, to image recognition. Even when the overall accuracy is high, the predictions often have systematic biases that harm specific subgroups, especially for subgroups that are minorities in the training data. We develop a rigorous fra