In this paper, we study the expressivity of scalar, Markovian reward functions in Reinforcement Learning (RL), and identify several limitations to what they can express. Specifically, we look at three classes of RL tasks: multi-objective RL, risk-sensitive RL, and modal RL. For each class, we derive necessary and sufficient conditions that describe when a problem in this class can be expressed using a scalar, Markovian reward. Moreover, we find that scalar, Markovian rewards are unable to express most of the instances in each of these three classes. We thereby contribute to a more complete understanding of what standard reward functions can and cannot express. We also call attention to modal problems as a new class of problems, since they have so far not been given any systematic treatment in the RL literature. Finally, we briefly outline approaches for solving some of the problems we discuss by means of bespoke RL algorithms.

On the Limitations of Markovian Rewards in Expressing Multi-Objective, Risk-Sensitive, and Modal Tasks