In text classification, traditional attention mechanisms tend to over-attend to frequent words and require extensive labeled data to learn effectively.
This paper proposes a perturbation-based self-supervised attention approach that
guides attention learning without any annotation overhead.
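
A minimal sketch of the general perturbation idea, under stated assumptions: a PyTorch classifier that maps word embeddings to logits, a learnable per-token noise scale, and a KL-based consistency term. The function name perturbation_attention_targets and all hyperparameters are illustrative assumptions, not the paper's exact formulation. Intuitively, a word that tolerates large noise without changing the prediction is unimportant, so its target attention weight is set low.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def perturbation_attention_targets(embeddings, lengths, classifier,
                                       steps=20, lr=0.05, tradeoff=1.0):
        """embeddings: (batch, seq_len, dim); classifier: embeddings -> logits.
        Returns soft attention targets of shape (batch, seq_len)."""
        with torch.no_grad():
            base_probs = F.softmax(classifier(embeddings), dim=-1)

        # One learnable log noise scale per token, initialized small.
        log_sigma = torch.full(embeddings.shape[:2], -2.0,
                               device=embeddings.device, requires_grad=True)
        opt = torch.optim.Adam([log_sigma], lr=lr)

        for _ in range(steps):
            sigma = log_sigma.exp().unsqueeze(-1)          # (batch, seq, 1)
            noise = torch.randn_like(embeddings) * sigma   # reparameterized noise
            logits = classifier(embeddings + noise)
            # Keep the perturbed prediction close to the clean one ...
            consistency = F.kl_div(F.log_softmax(logits, dim=-1),
                                   base_probs, reduction='batchmean')
            # ... while encouraging as much noise as possible per token.
            loss = consistency - tradeoff * log_sigma.mean()
            opt.zero_grad()
            loss.backward()
            opt.step()

        # Low noise tolerance -> important word -> high target attention.
        importance = -log_sigma.detach()
        mask = (torch.arange(embeddings.size(1), device=embeddings.device)
                .unsqueeze(0) < lengths.unsqueeze(1))
        importance = importance.masked_fill(~mask, float('-inf'))
        return F.softmax(importance, dim=-1)

The returned distribution could then supervise the model's own attention layer with an auxiliary loss (e.g., a KL term between predicted and target attention), which is the self-supervised signal the abstract alludes to.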