Low-rank approximation techniques have become the de facto standard for
fine-tuning Large Language Models (LLMs) due to their reduced computational and
memory requirements. This paper investigates the effectiveness of these methods
in capturing the shift of fine-tuning datasets from the initial pre-trained
data distribution. Our findings reveal that there are cases in which low-rank
fine-tuning falls short in learning such shifts. This, in turn, produces
non-negligible side effects, especially when fine-tuning is adopted for
toxicity mitigation in pre-trained models, or in scenarios where it is
important to provide fair models. Through comprehensive empirical evidence on
several models, datasets, and tasks, we show that low-rank fine-tuning
inadvertently preserves undesirable biases and toxic behaviors. We also show
that this extends to sequential decision-making tasks, emphasizing the need for
careful evaluation to promote responsible LLM development.
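To make the term concrete, the following is a minimal sketch of a low-rank weight update, assuming a LoRA-style parameterization in which a frozen weight matrix W receives an additive update B·A of rank r. The abstract does not name a specific method, so the function names, the `scale` parameter, and the example dimensions below are illustrative assumptions rather than the paper's setup.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def low_rank_update(W, B, A, scale=1.0):
    """Return W + scale * (B @ A), where W is d x k, B is d x r, and A is r x k.

    In a LoRA-style setup (assumed here), W stays frozen and only B and A
    are trained, so the learned change to W has rank at most r.
    """
    delta = matmul(B, A)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


def trainable_params(d, k, r):
    """Compare parameter counts: full fine-tuning vs. a rank-r update."""
    full = d * k
    low_rank = r * (d + k)
    return full, low_rank


# Toy example: a rank-1 update applied to a 2 x 2 identity matrix.
W = [[1, 0], [0, 1]]
B = [[1], [2]]   # d x r with r = 1
A = [[3, 4]]     # r x k
updated = low_rank_update(W, B, A)

# Hypothetical layer size, chosen only to illustrate the savings.
full, low_rank = trainable_params(4096, 4096, 8)
```

Because the update is constrained to rank r, it can represent only a restricted family of changes to W, which is one intuition for why a large distribution shift may be learned only partially.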

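The study shows that low-rank fine-tuning falls short in capturing the shift of fine-tuning datasets from the initial pre-training data distribution. This produces non-negligible side effects, including the inadvertent preservation of undesirable biases and toxic behaviors when fine-tuning is used for toxicity mitigation or to provide fair models. The findings extend to sequential decision-making tasks, underscoring the need for careful evaluation to promote responsible LLM development.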

Low-rank finetuning for LLMs: A fairness perspective

The widespread adoption of generative image models has highlighted the urgent
need to detect artificial content, which is a crucial step in combating
widespread manipulation and misinformation. Consequently, numerous detectors
and associated datasets have emerged. However, many of these datasets
inadvertently introduce undesirable biases, thereby impacting the effectiveness
and evaluation of detectors. In this paper, we emphasize that many datasets for
AI-generated image detection contain biases related to JPEG compression and
image size. Using the GenImage dataset, we demonstrate that detectors indeed
learn from these undesired factors. Furthermore, we show that removing the
named biases substantially increases robustness to JPEG compression and
significantly alters the cross-generator performance of evaluated detectors.
Specifically, it leads to an increase of more than 11 percentage points in
cross-generator performance for ResNet50 and Swin-T detectors on the GenImage
dataset, achieving state-of-the-art results.
We provide the dataset and source code of this paper on an anonymous
website: this https URL
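One simple way to check whether a dataset carries the kind of format bias described above is to ask how well the label can be predicted from file metadata alone. The sketch below is a hypothetical diagnostic, not the authors' procedure: the function name, the record layout, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def format_only_accuracy(records):
    """Accuracy of the best rule that predicts the label from file format alone.

    `records` is a list of (file_format, label) pairs. A value near 1.0 means
    the format leaks the label; a value near the majority-class rate means
    the format carries no usable signal.
    """
    by_format = {}
    for file_format, label in records:
        by_format.setdefault(file_format, Counter())[label] += 1
    # For each format, the best rule predicts that format's majority label.
    correct = sum(counts.most_common(1)[0][1] for counts in by_format.values())
    return correct / len(records)


# Toy data (invented): every real image is a JPEG, every generated one a PNG.
biased = [("jpeg", "real")] * 3 + [("png", "generated")] * 3
# Toy data (invented): format is independent of the label.
balanced = [
    ("jpeg", "real"), ("jpeg", "generated"),
    ("png", "real"), ("png", "generated"),
]

biased_score = format_only_accuracy(biased)
balanced_score = format_only_accuracy(balanced)
```

The same check can be repeated with image width and height in place of the file format. A detector trained on data where such a rule scores highly may be learning the compression or size cue rather than generation artifacts.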

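The study discusses the urgency of detecting artificial content produced by generative image models and points out that current datasets contain biases related to JPEG compression and image size. It also shows that removing these biases substantially affects robustness to JPEG compression and the cross-generator performance of the evaluated detectors: for ResNet50 and Swin-T detectors on the GenImage dataset, cross-generator performance improves by more than 11 percentage points, achieving state-of-the-art results.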

Fake or JPEG? Revealing Common Biases in Generated Image Detection Datasets

Pre-trained language models and other generative models have revolutionized
NLP and beyond. However, these models tend to reproduce undesirable biases
present in their training data. Also, they may overlook patterns that are
important but challenging to capture. To address these limitations, researchers
have introduced distributional control techniques. These techniques, not
limited to language, allow controlling the prevalence (i.e., expectations) of
any features of interest in the model's outputs. Despite their potential, the
widespread adoption of these techniques has been hindered by the difficulty in
adapting complex, disconnected code. Here, we present disco, an open-source
Python library that brings these techniques to the broader public.
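The idea of controlling the expectation of a feature can be illustrated on a toy discrete distribution. The sketch below does not use disco's API. It assumes one common formulation, in which the target distribution is an exponential tilt of the base model, p(x) ∝ a(x)·exp(λ·φ(x)), with λ chosen so that the expected value of the feature φ matches a desired target. All names and numbers are illustrative.

```python
import math


def tilted(base, feature, lam):
    """Reweight `base` by exp(lam * feature(x)) and renormalize."""
    weights = {x: p * math.exp(lam * feature(x)) for x, p in base.items()}
    z = sum(weights.values())
    return {x: w / z for x, w in weights.items()}


def expectation(dist, feature):
    """Expected value of `feature` under the distribution `dist`."""
    return sum(p * feature(x) for x, p in dist.items())


def fit_lambda(base, feature, target, lo=-50.0, hi=50.0, iters=100):
    """Find lam by bisection so that the tilted expectation equals `target`.

    Bisection works because the tilted expectation is non-decreasing in lam.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expectation(tilted(base, feature, mid), feature) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Toy base "model" over three topics (invented numbers).
base = {"science": 0.7, "sports": 0.2, "art": 0.1}


def mentions_art(x):
    """Binary feature: 1.0 if the outcome is about art, else 0.0."""
    return 1.0 if x == "art" else 0.0


# Raise the prevalence of the feature from 10% to 30%.
lam = fit_lambda(base, mentions_art, target=0.3)
controlled = tilted(base, mentions_art, lam)
```

In this toy case the controlled distribution keeps the relative proportions of the other outcomes unchanged while moving the feature's prevalence to the target. For a real language model the distribution cannot be enumerated, so the same objective has to be approached by sampling and fine-tuning.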

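This paper presents disco, a Python library that makes distributional control techniques more accessible to the broader public, addressing the shortcomings and limitations of existing language models and other generative models.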