Many widely used datasets for graph machine learning tasks have generally been homophilous, where nodes with similar labels connect to each other. Recently, new graph neural networks (GNNs) have been developed that move beyond the homophily regime; however, their evaluation has often b