We study the learning of Nash equilibria in a general-sum stochastic game with an
unknown transition probability density function. Agents take actions at the
current environment state, and their joint action influences both the transition of
the environment state and their immediate rewards. Each a