Unlike widely used named entity recognition (NER) data sets in generic
domains, biomedical NER data sets often contain mentions consisting of
discontinuous spans. Conventional sequence tagging techniques encode Markov
assumptions that are efficient but preclude recovery of these mentions.