Sharing information between multiple tasks enables algorithms to achieve good
generalization performance even from small amounts of training data. However,
in realistic multi-task learning scenarios not all tasks are equally
related to one another; hence, it could be advantageous to