Meta reinforcement learning (Meta RL) has been widely explored as a way to
quickly learn an unseen task by transferring knowledge learned from similar
tasks. However, most state-of-the-art algorithms require the meta-training
tasks to densely cover the task distribution and a large amount of
data for each of them. In this paper, we propose MetaDreamer, a context-based
Meta RL algorithm that requires fewer real training tasks and less data by performing
meta-imagination and MDP-imagination. We perform meta-imagination by
interpolating in the learned latent context space, which has disentangled
properties, and MDP-imagination through a generative world model in which
physical knowledge is added to plain VAE networks. Our experiments on various
benchmarks show that MetaDreamer outperforms existing approaches in data
efficiency and interpolated generalization.
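
As a minimal sketch of the interpolation idea behind meta-imagination, the snippet below linearly interpolates between two latent context vectors to produce contexts for imagined tasks. The function name `interpolate_contexts`, the two-dimensional vectors, and the choice of linear interpolation are all illustrative assumptions; the abstract does not specify the interpolation scheme or the latent dimensionality.

```python
from typing import List, Sequence


def interpolate_contexts(
    z_a: Sequence[float], z_b: Sequence[float], num_points: int
) -> List[List[float]]:
    """Return num_points latent vectors evenly spaced from z_a to z_b (inclusive).

    Hypothetical helper: linear interpolation is an assumption made for
    illustration, not necessarily the scheme used by MetaDreamer.
    """
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same dimension")
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    contexts = []
    for i in range(num_points):
        alpha = i / (num_points - 1)  # 0.0 at z_a, 1.0 at z_b
        contexts.append([(1 - alpha) * a + alpha * b for a, b in zip(z_a, z_b)])
    return contexts


# Two made-up latent contexts that differ only in the first dimension,
# mimicking a disentangled space where one factor varies between tasks.
z_task_a = [0.0, 1.0]
z_task_b = [2.0, 1.0]
imagined = interpolate_contexts(z_task_a, z_task_b, 5)
```

Because the two example contexts share the same second coordinate, every interpolated context keeps that coordinate fixed while the first varies, which is the behavior one would hope for when the latent dimensions are disentangled.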

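MetaDreamer is a context-based meta reinforcement learning algorithm that reduces the need for real training tasks and data through meta-imagination and MDP-imagination. It learns unseen tasks by transferring previously learned knowledge from similar tasks, and experimental results show that MetaDreamer outperforms existing approaches in data efficiency and interpolated generalization.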