From homeostasis to resource sharing: Biologically and economically compatible multi-objective multi-agent AI safety benchmarks

Sep, 2024

Roland Pihlakas, Joel Pyykkö

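TL;DR: This study addresses the lack of automated empirical testing against human values in the AI safety field. By introducing biologically and economically motivated themes such as homeostasis and resource sharing, it shows the need for multiple objectives and for balancing among them, both of which the modern reinforcement learning literature has neglected in its treatment of safety. The results indicate that current mainstream AI safety discussion has significant gaps and that the relevant benchmarks need further development.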

Abstract

Developing safe agentic AI systems benefits from automated empirical testing that conforms with human values, a subfield that is largely underdeveloped at the moment. To contribute to this topic, the present work focuses on introducing biologically and economically motivated themes that have been neglected in the safety aspects of modern →