Many popular representation-learning algorithms use training objectives defined on the observed data space, which we call pixel-level. This may be detrimental when only a small fraction of the bits of signal actually matter at a semantic level. We hypothesize that representations shoul