Today, the dominant paradigm for training neural networks involves minimizing
task loss on a large dataset. How to inform a model with world knowledge while
retaining the ability to train end-to-end remains an open question. In
this paper, we present a novel framework for intro