Standard methods for differentially private training of deep neural networks
replace back-propagated mini-batch gradients with biased and noisy
approximations to the gradient. These modifications to training often result in
a privacy-preserving model that is significantly less accurate than its non-private counterpart.
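The gradient modification described above can be sketched as follows. This is a minimal illustration (not the paper's implementation) of the standard DP-SGD recipe: each example's gradient is clipped to a fixed L2 norm (the source of bias), the clipped gradients are summed, and Gaussian noise calibrated to the clipping norm is added. The function name, parameters, and NumPy usage are illustrative assumptions.

```python
import numpy as np

def private_gradient(per_example_grads, clip_norm=1.0,
                     noise_multiplier=1.0, rng=None):
    """Sketch of the standard DP-SGD gradient update (assumed API):
    clip each per-example gradient, sum, add Gaussian noise, average."""
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds clip_norm;
        # this bounds each example's contribution but biases the estimate.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    summed = np.sum(clipped, axis=0)
    # Gaussian noise with scale proportional to the clipping norm,
    # which bounds the sensitivity of the summed gradient.
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=summed.shape)
    return (summed + noise) / len(per_example_grads)

grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]
g_priv = private_gradient(grads, clip_norm=1.0, noise_multiplier=1.0)
```

With `noise_multiplier=0` the function reduces to averaged clipped gradients, which makes the bias introduced by clipping easy to inspect in isolation.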