autoregressive predictive coding (APC), as a self-supervised objective, has
enjoyed success in learning representations from large amounts of unlabeled
data, and the learned representations are rich for many downstream tasks.
However, the connection between low self-supervised loss and