Rafael Rivera Soto, Kailin Koch, Aleem Khan, Barry Chen, Marcus Bishop...
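TL;DR: Writing style estimated from human-written text is used to distinguish human authors from machine authors, and to predict which language model generated a given document.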
Abstract
The advent of instruction-tuned language models that convincingly mimic human
writing poses a significant risk of abuse. For example, such models could be
used for plagiarism, disinformation, spam, or phishing. However, such abuse may
be counteracted with the ability to detect whether