continual learning is essential for real-world deployment when there is a
need to quickly adapt the model to new tasks without forgetting knowledge of
old tasks. Existing work on continual sequence generation either always reuses
existing parameters to learn new tasks, which is vulnera