With the rise of individual and collaborative networks of autonomous agents, AI is increasingly deployed in key reasoning and decision-making roles. Ethics-based audits therefore play a pivotal role in the rapidly growing fields of AI safety and regulation. This paper undertakes an ethics-based audit of eight leading commercial and open-source Large Language Models, including GPT-4. We assess explicability and trustworthiness by (a) establishing how well different models engage in moral reasoning and (b) comparing the normative values underlying the models as ethical frameworks. We employ an experimental, evidence-based approach that challenges the models with ethical dilemmas in order to probe human-AI alignment. Each ethical scenario is designed to require a decision in which the particulars of the situation may or may not necessitate deviating from normative ethical principles. One model, GPT-4, consistently exhibited a sophisticated ethical framework. Nonetheless, troubling findings include underlying normative frameworks with a clear bias towards particular cultural norms. Many models also exhibit disturbing authoritarian tendencies. Code is available at https://github.com/jonchun/llm-sota-chatbots-ethics-based-audit.

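Summary: Through an ethics-based audit, this study assesses the explicability and trustworthiness of eight leading commercial and open-source Large Language Models, including GPT-4. It compares the models' moral reasoning and the normative values underlying their ethical frameworks in order to probe human-AI alignment. The results show that GPT-4 exhibited a sophisticated ethical framework, while the audit also revealed normative frameworks with a clear bias towards particular cultural norms and, in many models, troubling authoritarian tendencies.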

Informed AI Regulation: Comparing the Ethical Frameworks of Leading LLM Chatbots Using an Ethics-Based Audit to Assess Moral Reasoning and Normative Values
