Adversarial attacks expose vulnerabilities of deep learning models by introducing minor perturbations to the input that lead to substantial alterations in the output. Our research focuses on the impact of such attacks on sequence-to-sequence (seq2seq) models, specifically machine translation models. We introduce algorithms that range from basic text perturbation heuristics to more advanced strategies, such as a gradient-based attack that uses a differentiable approximation of the inherently non-differentiable translation metric. Our investigation provides evidence that machine translation models are robust against the best-performing known adversarial attacks: the degree of perturbation in the output is directly proportional to the perturbation in the input. However, among these weak attacks, ours outperform the alternatives, providing the best relative performance. Another strong candidate is an attack based on mixing individual characters.

Machine translation models prove robust against adversarial attacks