generative classifiers offer potential advantages over their discriminative
counterparts, namely in the areas of data efficiency, robustness to data shift
and adversarial examples, and zero-shot learning (Ng and Jordan, 2002; Yogatama
et al., 2017; Lewis and Fan, 2019). In this paper, we