The performance of deep (reinforcement) learning systems crucially depends on
the choice of hyperparameters. Tuning them is notoriously expensive, typically
requiring an iterative training process to run for many steps until
convergence. Traditional tuning algorithms only consider th