A major thread of unsupervised domain adaptation (UDA) methods uses unlabeled data from both source and target domains to learn domain-invariant representations for adaptation. However, these methods showcase certain limitations, encouraging the use of self-supervised learning through continued pre-training. The necessity of continued pre-training or learning domain-invariant representations is still unclear in the prompt-based classification framework, where an input example is modified by a template and then fed into a language model (LM) to generate a label string. To examine this new paradigm of UDA in the prompt-based setup, we propose a frustratingly easy UDA method (FEUDA) that trains an autoregressive LM on both unlabeled and labeled examples using two different instruction-tuning tasks. Specifically, the first task trains the LM on unlabeled texts from both domains via masked language modeling (MLM), and the other uses supervised instruction-tuning on source-labeled data for classification. We conduct extensive experiments on 24 real-world domain pairs to show the effectiveness of our method over strong domain-invariant learning methods. Our analysis sheds light on why masked language modeling improves target-domain classification performance in prompt-based UDA. We discover that MLM helps the model learn both semantic and background knowledge of a domain, which are both beneficial for downstream classification.

通过在无标签数据上进行句子掩码模型训练（MLM）和源标记数据上进行监督指导训练，采用自监督学习和提示模型术语分类方法，我们提出了一种叫做困难易化领域适应（FEUDA）的方法，通过训练一个自回归语言模型，从源和目标领域的标签和无标签示例中，来学习领域不变表征，以提高目标领域的分类性能。

FEUDA: 极其简便的基于提示的无监督领域自适应