Transformer-based language models (LMs) create hidden representations of
their inputs at every layer, but only use final-layer representations for
prediction. This obscures the internal decision-making process of the model and
the utility of its intermediate representations.
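As a concrete illustration, the sketch below (a minimal example assuming the HuggingFace `transformers` library, with GPT-2 as a stand-in model) requests the hidden states of every layer and then checks that applying the unembedding to the final layer's representation reproduces the model's next-token logits, i.e., that only the last layer feeds the prediction.

```python
# A minimal sketch, assuming the HuggingFace `transformers` library and GPT-2
# as a stand-in model: hidden states exist at every layer, but only the
# final-layer representation is unembedded into next-token logits.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# One tensor per layer (plus the embedding output), each of shape
# (batch, sequence_length, hidden_size): representations exist at every layer.
print(len(outputs.hidden_states))  # 13 for GPT-2: embeddings + 12 blocks

# But prediction uses only the final layer: unembedding the last hidden
# state (via lm_head) reproduces the logits the model actually returns.
logits_from_final = model.lm_head(outputs.hidden_states[-1])
assert torch.allclose(logits_from_final, outputs.logits, atol=1e-4)
```

One way to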