Scalable and effective exploration remains a key challenge in reinforcement
learning (RL). While there are methods with optimality guarantees in the
setting of discrete state and action spaces, these methods cannot be applied in
high-dimensional deep RL scenarios. As such, most contemp