There has been an increasing interest in 3d indoor navigation, where a robot in an environment moves to a target according to an instruction. To deploy a robot for navigation in the physical world, lots of training data is required to learn an effective policy. It is quite labour inten