Visual storytelling is a technically challenging task that aims to generate an
imaginative and coherent story, told in multiple narrative sentences, from a
group of related images. Existing methods often generate direct and rigid
descriptions of the apparent image content, because they are no