Parametrically Retargetable Decision-Makers Tend To Seek Power

June 2022

Alexander Matt Turner, Prasad Tadepalli

TL;DR: AI agents may acquire power, and their learned policies may tend to seek power in real-world environments, which could pose safety risks.

Abstract

If capable AI agents are generally incentivized to seek power in service of the objectives we specify for them, then these systems will pose enormous risks, in addition to enormous benefits. In fully observable environments, most reward functions have an optimal policy which seeks powe