Pre-trained language representation models, such as BERT, capture general language representations from large-scale corpora but lack domain-specific knowledge. When reading text in a specific domain, experts draw on relevant knowledge to make inferences. For machines to achieve this capability, we pro