VideoTetris:走向组合式文本到视频生成Diffusion models have limitations when handling complex video generation scenarios, so VideoTetris proposes a novel framework using spatio-temporal compositional diffusion for precise T2V generation by manipulating attention maps and enhancing training data, achieving impressive results.
Jun, 2024