reinforcement learning has been successful in many tasks ranging from robotic
control, games, energy management etc. In complex real world environments with
sparse rewards and long task horizons, sample efficiency is still a major
challenge. Most complex tasks can be easily decomposed