Building agents that can explore their environments intelligently is a
challenging open problem. In this paper, we take a step towards understanding
how a hierarchical design of the agent's policy can affect its exploration
capabilities. First, we design EscapeRoom environments, where the agent must
figure out how to navigate to the exit by accomplishing a n