convolutional neural networks (CNNs) have been shown to achieve optimal approximation and estimation error rates (in minimax sense) in several function classes. However, previously analyzed optimal CNNs are unrealistically wide and difficult to obtain via optimization due to sparse con