Recent advances in dialogue response selection (DRS) are based on the \textit{task-adaptive pre-training (TAP)} approach, which first initializes the model with BERT~\cite{devlin-etal-2019-bert} and then adapts it to dialogue data with dialogue-specific or fine-grained pre-training tasks. However, it is unclear whether BERT is the best initialization choice, or whether the proposed dialogue-specific fine-grained learning tasks are actually better than MLM+NSP. This paper aims to verify the assumptions made in previous work and to understand the source of improvements for DRS. We show that initializing with RoBERTa achieves performance similar to BERT, and that MLM+NSP can outperform all previously proposed TAP tasks; in the process, we also set a new state of the art on the Ubuntu corpus. Additional analyses show that the main source of improvements is the TAP step, and that the NSP task is crucial to DRS, in contrast to common NLU tasks.

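This study aims to verify the assumptions about initialization choice made in previous papers and to understand the source of improvements in DRS. It shows that initializing with RoBERTa performs similarly to BERT, that MLM+NSP can outperform all previously proposed TAP tasks, that the NSP task is very important for DRS, in contrast to common NLU tasks, and that the TAP step is the main source of improvements in DRS.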

Dialogue response selection; task-adaptive pre-training