There are only a few learning algorithms applicable to stochastic dynamic games. Learning in games is generally difficult because of the non-stationary environment in which each decision maker aims to learn its optimal decisions with minimal information in the presence of the other dec