Multilingual contextual embeddings, such as multilingual BERT and
XLM-RoBERTa, have proved useful for many multilingual tasks. Previous work
probed the cross-linguality of the representations indirectly using zero-shot
transfer learning on morphological and syntactic tasks. We instead