Training Socially Aligned Language Models in Simulated Human Society

May, 2023

Training Socially Aligned Language Models in Simulated Human Society

Ruibo Liu, Ruixin Yang, Chenyan Jia, Ge Zhang, Denny Zhou...

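TL;DR: Proposes a new training paradigm for language models (LMs) in which they learn from simulated social interactions, so that AI systems conform better to social norms and values.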

Abstract

Social alignment in AI systems aims to ensure that these models behave according to established societal values. However, unlike humans, who derive consensus on value judgments through social interaction, current