Abstract
Modern reinforcement learning algorithms reach super-human performance in many board and video games, but they are sample inefficient, i.e. they typically require significantly more playing experience than humans to reach an equal performance level. To improve