In distributed learning, the goal is to perform a learning task over data
distributed across multiple nodes with minimal (expensive) communication. Prior
work (Daume III et al., 2012) proposes a general model that bounds the
communication required for learning classifiers while allowin