We present two categories of model-agnostic adversarial strategies that reveal the weaknesses of several generative, task-oriented dialogue models: Should-Not-Change strategies, which evaluate over-sensitivity to small, semantics-preserving edits, and Should-Change strategies, which test whether a model is over-stable against subtle yet semantics-changing modifications. We then perform adversarial training with each strategy, employing a max-margin approach for negative generative examples. This not only makes the target dialogue model more robust to the adversarial inputs, but also helps it perform significantly better on the original inputs. Moreover, training on all strategies combined yields further improvements and a new state-of-the-art performance on the original task (also verified via human evaluation). In addition to adversarial training, we also address robustness at the model level by using subword units as both the model's inputs and outputs, and show that the resulting model is equally competitive, requires only 1/4 of the original vocabulary size, and is robust to one of the adversarial strategies (to which the original model is vulnerable) even without adversarial training.
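
To make the two categories concrete, the following is a minimal sketch of one perturbation of each kind. The specific edits shown here (an adjacent-word swap and a rule-based negation), the function names, and the `NEGATABLE` table are illustrative assumptions and are not taken from the abstract, which does not enumerate the individual strategies.

```python
import random

# Hypothetical lookup of auxiliaries and their negated forms (illustration only).
NEGATABLE = {"is": "is not", "can": "cannot", "do": "do not", "will": "will not"}


def swap_adjacent(tokens, rng):
    """Should-Not-Change style edit: swap one random pair of neighbouring
    tokens, which usually leaves the meaning of the utterance intact."""
    out = list(tokens)
    if len(out) < 2:
        return out
    i = rng.randrange(len(out) - 1)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


def add_negation(tokens):
    """Should-Change style edit: negate the first auxiliary found, which
    changes the meaning while keeping the surface form nearly identical."""
    out = []
    negated = False
    for tok in tokens:
        if not negated and tok in NEGATABLE:
            out.extend(NEGATABLE[tok].split())
            negated = True
        else:
            out.append(tok)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    utterance = "the server is running now".split()
    print(swap_adjacent(utterance, rng))
    print(add_negation(utterance))
```

A robust model should give essentially the same response to the output of `swap_adjacent` as to the original utterance, and a different response to the output of `add_negation`.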

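The study introduces two categories of model-agnostic adversarial strategies for generative dialogue models, Should-Not-Change and Should-Change, and shows that adversarial training on them makes the target dialogue model more robust and improves its performance. It also addresses robustness at the model level: a subword-based model remains competitive while using only a quarter of the original vocabulary, and it is robust to one of the adversarial strategies even without adversarial training.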
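
The max-margin idea for negative examples can be illustrated with a generic hinge loss over sequence log-likelihoods. This is a sketch under stated assumptions, not necessarily the exact objective used in the paper: the function name, the `margin` default, and the choice of which two log-likelihoods to compare are all hypothetical.

```python
def max_margin_loss(logp_positive, logp_negative, margin=1.0):
    """Generic hinge loss (assumed form): penalise the model unless the
    log-likelihood of the desired response exceeds that of the undesired
    (negative) response by at least `margin`."""
    return max(0.0, margin - (logp_positive - logp_negative))


if __name__ == "__main__":
    # Desired response is already much more likely: no penalty.
    print(max_margin_loss(-1.0, -5.0))
    # Negative response is more likely than the desired one: positive loss.
    print(max_margin_loss(-3.0, -2.5))
```

For a Should-Change input, one plausible use is to treat the original response paired with the modified context as the negative example, so that the model is pushed away from repeating its old answer when the meaning of the input has changed.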

Adversarial Over-Sensitivity and Over-Stability Strategies for Dialogue Models