Pre-trained foundation models, owing primarily to their enormous capacity and
exposure to vast amounts of training data scraped from the internet, have the
advantage of storing knowledge about a wide range of real-world concepts. Such models
are typically fine-tuned on downstream datasets to