A Statistical Case Against Empirical Human-AI Alignment

Feb, 2025

Julian Rodemann, Esteban Garces Arias, Christoph Luther, Christoph Jansen, Thomas Augustin

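TL;DR: This paper examines the statistical biases that arise in empirical human-AI alignment and argues that the approach can introduce unnecessary risk. The authors propose two more cautious alternatives, prescriptive alignment and a posteriori empirical alignment, and support their position with concrete examples such as human-centric decoding of language models. The study stresses the importance of making AI systems more reliable and calls for deeper reflection on, and improvement of, current alignment practice.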

Abstract

Empirical Human-AI Alignment aims to make AI systems act in line with observed human behavior. While noble in its goals, we argue that Empirical Alignment can inadvertently introduce statistical biases that warra