reinforcement learning can solve decision-making problems and train an agent
to behave in an environment according to a predesigned reward function.
However, such an approach becomes very problematic if the reward is too sparse
and the agent does not come across the reward during the e