With the demanding need for deploying dialogue systems in new domains with less cost, zero-shot dialogue state tracking (DST), which tracks user's requirements in task-oriented dialogues without training on desired domains, draws attention increasingly. Although prior works have leveraged question-answering (QA) data to reduce the need for in-domain training in DST, they fail to explicitly model knowledge transfer and fusion for tracking dialogue states. To address this issue, we propose CoFunDST, which is trained on domain-agnostic QA datasets and directly uses candidate choices of slot-values as knowledge for zero-shot dialogue-state generation, based on a T5 pre-trained language model. Specifically, CoFunDST selects highly-relevant choices to the reference context and fuses them to initialize the decoder to constrain the model outputs. Our experimental results show that our proposed model achieves outperformed joint goal accuracy compared to existing zero-shot DST approaches in most domains on the MultiWOZ 2.1. Extensive analyses demonstrate the effectiveness of our proposed approach for improving zero-shot DST learning from QA.

本研究提出 CoFunDST，使用域不可知的 QA 数据集培训，通过 T5 预训练语言模型基于候选选项直接生成零样本对话状态。与现有的零样本 DST 方法相比，在 MultiWOZ 2.1 的大多数领域中，我们的模型具有更好的联合目标准确性。

选择融合作为零样本对话状态跟踪的知识