Jun, 2024
VideoTetris:走向组合式文本到视频生成
VideoTetris: Towards Compositional Text-to-Video Generation
TL;DRDiffusion models have limitations when handling complex video generation scenarios, so VideoTetris proposes a novel framework using spatio-temporal compositional diffusion for precise T2V generation by manipulating attention maps and enhancing training data, achieving impressive results.