Sample efficiency is critical in solving real-world reinforcement learning problems, where agent-environment interactions can be costly. Imitation learning from expert advice has proved to be an effective strategy for reducing the number of interactions required to train a policy. Onli