Deep neural networks have achieved impressive performance in many applications, but their large number of parameters leads to significant computational and storage overheads. Several recent works attempt to mitigate these overheads by designing compact networks using pruning of connectio