Reinforcement-learning agents seek to maximize a reward signal through
environmental interactions. As humans, we contribute to the learning process
by designing the reward function. Like programmers, we have a behavior
in mind and have to translate it into a formal specification, namely rewards.
In this work, we consider the reward-design problem