Many recently trained neural networks employ large numbers of parameters to
achieve good performance. One may intuitively use the number of parameters
required as a rough gauge of the difficulty of a problem. But how accurate are
such notions? How many parameters are really needed? In