Collection of annotated dialogs for training task-oriented dialog systems have been one of the key bottlenecks in improving current models. While dialog response generation has been widely studied on the agent side, it is not evident if similar generative models can be used to generate a large variety of, and often unexpected, user inputs that real dialog systems encounter in practice. Existing data augmentation techniques such as paraphrase generation do not take the dialog context into consideration. In this paper, we develop a novel dialog augmentation model that generates a user turn, conditioning on full dialog context. Additionally, with a new prompt design for language model, and output re-ranking, the dialogs generated from our model can be directly used to train downstream dialog systems. On common benchmark datasets MultiWoZ and SGD, we show that our dialog augmentation model generates high quality dialogs and improves dialog success rate by as much as $8\%$ over baseline.

我们开发了一种新型的对话扩充模型，通过完整的对话上下文生成用户的回合，并通过语言模型的新提示设计和输出重新排序，所生成对话可直接用于训练下游对话系统，在常见的基准数据集MultiWoZ和SGD上，展示了我们的对话扩充模型生成高质量对话并使对话成功率较基准线提高多达8%。

面向任务导向对话系统的语境数据增强