Offensive language such as hate, abuse, and profanity (HAP) appears across many kinds of web content. While previous work has mostly dealt with sentence-level annotations, a few recent attempts have also sought to identify offensive spans. We build upon this work and introduce Muted, a system that identifies multilingual HAP content by displaying offensive arguments and their targets, using heat maps to indicate their intensity. Muted can leverage any transformer-based HAP-classification model and its attention mechanism out of the box to identify toxic spans, without further fine-tuning. In addition, we use the spaCy library to identify the specific targets and arguments for the words predicted by the attention heat maps. We report the model's performance on identifying offensive spans and their targets in existing datasets, and we present new annotations on German text. Finally, we demonstrate our proposed visualization tool on multilingual inputs.
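
The span-identification step described above can be illustrated with a minimal sketch. It assumes that one attention-derived score per token has already been obtained from a HAP classifier; the abstract does not say how Muted aggregates attention across heads and layers, so the scores here are simply given as input. The function names `normalize` and `extract_spans` and the `threshold` parameter are hypothetical and are not taken from Muted itself.

```python
from typing import List, Tuple


def normalize(scores: List[float]) -> List[float]:
    """Min-max scale scores to [0, 1] so they can be read as heat map intensities."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All tokens scored equally, so no token stands out.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def extract_spans(
    tokens: List[str], scores: List[float], threshold: float = 0.5
) -> List[Tuple[int, int, str]]:
    """Return (start, end, text) for each maximal run of high-scoring tokens.

    `end` is exclusive. `threshold` is an illustrative cutoff applied to the
    normalized scores (an assumption, not a value from the paper).
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not tokens:
        return []

    heat = normalize(scores)
    spans: List[Tuple[int, int, str]] = []
    start = None  # index where the current high-scoring run began, if any
    for i, h in enumerate(heat):
        if h >= threshold:
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
    if start is not None:
        # Close a run that extends to the end of the sentence.
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Made-up per-token scores, for illustration only.
tokens = ["you", "are", "a", "complete", "idiot", "honestly"]
scores = [0.10, 0.05, 0.05, 0.30, 0.45, 0.05]
print(extract_spans(tokens, scores))
```

In a full pipeline the normalized values would drive the heat map colors, and a dependency parse (for example from spaCy) would then link each extracted span to its target; both steps are omitted here.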

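Hateful, abusive, and profane language appears throughout the web. We built a system called Muted that identifies multilingual hate speech content by using heat maps to display the intensity of offensive arguments and their targets. Muted can use any Transformer-based hate classification model and its attention mechanism to identify harmful spans directly, without further fine-tuning. In addition, we use the spaCy library to identify the specific targets and arguments of the words predicted by the attention heat maps. We report the model's performance at identifying offensive spans and their targets in existing datasets, and we provide new annotations on German text. Finally, we demonstrate our proposed visualization tool on multilingual inputs.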

Multilingual Targeted Offensive Speech Identification and Visualization