In this work, we attempt to answer a critical question: does there exist
some input sequence that will cause a well-trained discrete-space neural
network sequence-to-sequence (seq2seq) model to generate egregious outputs
(aggressive, malicious, attacking, etc.)? And if such inputs exist, how can
they be found efficiently? We adopt an empirical methodology, in