Policy gradient methods have enabled deep reinforcement learning (RL) to
approach challenging continuous control problems, even when the underlying
systems involve highly nonlinear dynamics that generate complex, non-smooth
optimization landscapes. We develop a rigorous framework for un