BriefGPT.xyz
Jan, 2020
The Incentives that Shape Behaviour
Ryan Carey, Eric Langlois, Tom Everitt, Shane Legg
TL;DR
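Formalises an agent's incentives to control certain variables with its decision and to respond to others, and demonstrates unique graphical criteria for detecting these incentives in any single-decision causal influence diagram; introduces structural causal influence models, a hybrid of the influence diagram and structural causal model frameworks; finally, shows how these incentives predict agent incentives in fairness and AI safety applications.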
Abstract
Which variables does an agent have an incentive to control with its decision, and which variables does it have an incentive to respond to? We formalise these incentives, and demonstrate unique graphical criteria …