pre-trained language model (PLM) has become a representative foundation model
in the natural language processing field. Most PLMs are trained with
linguistic-agnostic pre-training tasks on the surface form of the text, such as
the masked language model (MLM). To further empower the PLM