Abstract

In natural language processing (NLP), pre-trained language models (LMs) transferred to downstream tasks have recently been shown to achieve state-of-the-art results. In this work, we extend the standard fine-tuning process of pre-trained LMs by introducing a new