Natural language object retrieval is a highly useful yet challenging task for
robots in human-centric environments. Previous work has primarily focused on
commands specifying the desired object's type, such as "scissors," and/or visual
attributes, such as "red," thus limiting the robot to