Model-based reinforcement learning algorithms are typically more sample
efficient than their model-free counterparts, especially in sparse reward
problems. Unfortunately, many interesting domains are too complex to specify
the complete models required by traditional model-based approaches. Learning a
model takes a large number of environment samples, and may not capture critical
information if the environment is hard to explore. If we could specify an
incomplete model and allow the agent to learn how best to use it, we could take
advantage of our partial understanding of many domains. Existing hybrid
planning and learning systems that address this problem often impose highly
restrictive assumptions on the sorts of models that can be used, limiting
their applicability across a wide range of domains. In this work we propose SAGE,
an algorithm combining learning and planning to exploit a previously unusable
class of incomplete models. SAGE combines the strengths of symbolic planning
and neural learning approaches in a novel way that outperforms competing
methods on variations of taxi world and Minecraft.

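This paper proposes SAGE, a new algorithm that combines symbolic planning with
neural learning to overcome the limitations of traditional models and to
address more efficiently the problems that model-based reinforcement learning
faces when the environment is only partially understood. The algorithm
outperforms competing methods on variations of taxi world and Minecraft.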

SAGE: Generating Symbolic Goals for Myopic Models in Deep Reinforcement Learning

Automatic Curriculum Learning (ACL) has become a cornerstone of recent
successes in Deep Reinforcement Learning (DRL). These methods shape the learning
trajectories of agents by challenging them with tasks adapted to their
capacities. In recent years, they have been used to improve sample efficiency
and asymptotic performance, to organize exploration, to encourage
generalization or to solve sparse reward problems, among others. The ambition
of this work is dual: 1) to present a compact and accessible introduction to
the Automatic Curriculum Learning literature and 2) to draw a bigger picture of
the current state of the art in ACL to encourage the cross-breeding of existing
concepts and the emergence of new ideas.

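This paper introduces the literature on Automatic Curriculum Learning (ACL) and
gives an overview of the current state of the art, with the aim of encouraging
the cross-breeding of existing concepts and the emergence of new ideas. ACL is
a cornerstone of recent successes in deep reinforcement learning and has been
used to improve sample efficiency and asymptotic performance, to organize
exploration, to encourage generalization, and to solve sparse reward problems,
among others.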

Automatic Curriculum Learning For Deep RL: A Short Survey

Representation learning is a central challenge across a range of machine
learning areas. In reinforcement learning, effective and functional
representations have the potential to tremendously accelerate learning progress
and solve more challenging problems. Most prior work on representation learning
has focused on generative approaches, learning representations that capture all
underlying factors of variation in the observation space in a more disentangled
or well-ordered manner. In this paper, we instead aim to learn functionally
salient representations: representations that are not necessarily complete in
terms of capturing all factors of variation in the observation space, but
rather aim to capture those factors of variation that are important for
decision making -- that are "actionable." These representations are aware of
the dynamics of the environment, and capture only the elements of the
observation that are necessary for decision making rather than all factors of
variation, without explicit reconstruction of the observation. We show how
these representations can be useful to improve exploration for sparse reward
problems, to enable long horizon hierarchical reinforcement learning, and as a
state representation for learning policies for downstream tasks. We evaluate
our method on a number of simulated environments, and compare it to prior
methods for representation learning, exploration, and hierarchical
reinforcement learning.

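This paper studies a reinforcement learning method for learning functionally
salient representations, which can be used to improve exploration in sparse
reward problems, to enable long-horizon hierarchical reinforcement learning,
and as a state representation for learning policies on downstream tasks.
Comparative experiments in several simulated environments indicate that the
method has advantages in representation learning, exploration, and
hierarchical reinforcement learning.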