AI creation, such as poem or lyrics generation, has attracted increasing
attention from both the industrial and academic communities, with many promising
models proposed in the past few years. Existing methods usually estimate the
outputs based on single and independent visual or textual inf