Traditional structured prediction models try to learn the conditional
likelihood, i.e., p(y|x), to capture the relationship between the structured
output y and the input features x. For many models, computing the likelihood is
intractable. These models are therefore hard to train, requ