Recent advances in large pre-trained language models have greatly improved performance on a broad set of NLP tasks. However, adapting an existing model to new tasks often requires (repeated) re-training on enormous amounts of labeled data, which are prohibitively expensive to obtain. Moreover,