Traditional Reinforcement Learning (RL) algorithms either predict rewards with value functions or maximize them using policy search. We study an alternative: upside-down reinforcement learning (Upside-Down RL or UDRL), which solves RL problems primarily using →