Selection of input features, such as relevant pieces of text, has become a
common technique for highlighting how complex neural predictors operate. The
selection can be optimized post-hoc for trained models or incorporated directly
into the method itself (self-explaining). However, an ove