Generative adversarial networks (GANs) generate data based on minimizing a
divergence between two distributions. The choice of that divergence is
therefore critical. We argue that the divergence must take into account the
hypothesis set and the loss function used in a subsequent learning