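TL;DR: By designing a new debate protocol, this paper shows how to address a challenge in AI safety: the honest strategy can successfully simulate a pre-trained AI system using a polynomial number of steps while still verifying the alignment of stochastic AI systems, even when the dishonest strategy is allowed an exponential number of simulation steps.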
Abstract
The emergence of pre-trained AI systems with powerful capabilities across a diverse and ever-increasing set of complex domains has raised a critical challenge for AI safety, as tasks can become too complicated for