BriefGPT.xyz
Feb, 2024
Agents Need Not Know Their Purpose
Paulo Garcia
TL;DR
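Ensuring that artificial intelligence stays consistent with human values is known as the alignment challenge. This paper describes a kind of agent called the oblivious agent, which behaves rationally and builds an internal approximation of its designers' intentions, thereby maximizing alignment; counterintuitively, the chances of alignment improve as the agent's level of intelligence increases.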
Abstract
Ensuring artificial intelligence behaves in such a way that is aligned with human values is commonly referred to as the alignment challenge. Prior work has shown that
→