While current benchmark reinforcement learning (RL) tasks have been useful for
driving progress in the field, they are in many ways poor substitutes for
learning with real-world data. By testing increasingly complex RL algorithms on
low-complexity simulation environments, we often end up