AbstractDue to the flexibility of prompting, foundation models have become the dominant force in the domains of natural language processing and image generation. With the recent introduction of the Segment Anything Model (SAM), the
prompt-driven paradigm has entered the realm of
→