AbstractAddressing such diverse ends as safety alignment with human preferences, and the efficiency of learning, a growing line of
reinforcement learning research focuses on risk functionals that depend on the entire distribution of returns. Recent work on \emph{
→