Weakly supervised video anomaly detection (WSVAD) is a challenging task.
Generating fine-grained pseudo-labels from weak labels and then
self-training a classifier is currently a promising solution. However, since
existing methods use only the RGB visual modality and the utilization