Reinforcement learning (RL) has shown empirical success in various real-world
settings with complex models and large state-action spaces. The existing
analytical results, however, typically focus on settings with a small number of
state-action pairs or simple models such as linearly modeled