Adaptive Preference Aggregation

March 2025

Benjamin Heymann

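TL;DR: This paper proposes a new approach to AI alignment, focusing on the aggregation of diverse human preferences. Building on a recently published urn process, it develops a preference aggregation strategy that adapts to the user's context and inherits the desirable properties of maximal lotteries. The aim is to overcome the theoretical limitations of current reinforcement learning from human feedback (RLHF) methods. The findings could help improve the recommendation capabilities of AI systems.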

Abstract

AI alignment, the challenge of ensuring AI systems act in accordance with human values, has emerged as a critical problem in the development of systems such as foundation models and recommender systems. Still, the current dominant approach, →