diffusion-based video generation models have demonstrated remarkable success
in obtaining high-fidelity videos through the iterative denoising process.
However, these models require multiple denoising steps during sampling,
resulting in high computational costs. In this work, we propos