Reinforcement learning (RL) has shown promise as a tool for engineering safe,
ethical, or legal behaviour in autonomous agents. Its use typically relies on
assigning punishments to state-action pairs that constitute unsafe or unethical
choices. Despite this assignment being a crucial s