We propose an exploration method that incorporates look-ahead search over
basic learnt skills and their dynamics, and use it for reinforcement learning
(RL) of manipulation policies . Our skills are multi-goal policies learned in
isolation in simpler environments using existing multigo