BriefGPT.xyz
Nov, 2023
Value FULCRA: Mapping Large Language Models to the Multidimensional Spectrum of Basic Human Values
Jing Yao, Xiaoyuan Yi, Xiting Wang, Yifan Gong, Xing Xie
TL;DR
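This study proposes a basic value alignment paradigm and constructs a value space spanned by basic value dimensions. By identifying underlying values, it maps all LLM behaviors into this space, aiming to address three challenges in the responsible development of LLMs.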
Abstract
The rapid advancement of large language models (LLMs) has attracted much attention to value alignment for their responsible development. However, how to define values in this context remains a largely unexplored
→