Dataset bias has attracted increasing attention recently for its detrimental
effect on the generalization ability of fine-tuned models. The current
mainstream solution is to design an additional shallow model to pre-identify
biased instances. However, such two-stage methods scale up th