Causal dynamics learning has recently emerged as a promising approach to
enhancing robustness in reinforcement learning (RL). Typically, the goal is to
build a dynamics model that makes predictions based on the causal relationships
among the entities. Although causal connections often manifest only in
certain contexts, existing approaches overlook such fine-grained
relationships and lack a detailed understanding of the dynamics. In this work,
we propose a novel dynamics model that infers fine-grained causal structures
and employs them for prediction, leading to improved robustness in RL. The key
idea is to jointly learn the dynamics model with a discrete latent variable
that quantizes the state-action space into subgroups. This allows the model to
recognize meaningful contexts that display sparse dependencies, with a causal
structure learned for each subgroup throughout training. Experimental results
demonstrate the robustness of our method to unseen states and locally spurious
correlations in downstream tasks where fine-grained causal reasoning is
crucial. We further illustrate the effectiveness of our subgroup-based approach
with quantization in discovering fine-grained causal relationships compared to
prior methods.
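
The key idea above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a state-action embedding `z` is already available from some encoder, and the names `quantize`, `masked_inputs`, `codebook`, and `masks` are hypothetical. Each sample is assigned to its nearest codebook vector (its subgroup), and the binary causal mask stored for that subgroup selects which input variables may be used to predict a given next-state variable.

```python
import numpy as np

def quantize(z, codebook):
    """Return the index of the nearest codebook vector for each row of z."""
    # Squared Euclidean distance between every embedding and every code.
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)

def masked_inputs(x, codes, masks, j):
    """Zero out inputs that are not parents of next-state variable j.

    x:     (batch, n_in) state-action variables
    codes: (batch,) subgroup index per sample
    masks: (n_codes, n_out, n_in) binary local causal graphs (assumed given)
    """
    return x * masks[codes, j, :]

# Toy data: two subgroups, three input variables, two output variables.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
z = np.array([[0.1, -0.1], [0.9, 1.2]])
codes = quantize(z, codebook)

masks = np.array([
    [[1, 0, 1], [0, 1, 1]],  # local causal graph for subgroup 0
    [[1, 1, 1], [0, 1, 0]],  # local causal graph for subgroup 1
])
x = np.array([[2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
x_masked = masked_inputs(x, codes, masks, j=0)
```

In a full model the codebook, the masks, and the predictor would be learned jointly. Here they are fixed by hand so that the subgroup-dependent masking is easy to follow.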

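We propose a novel dynamics model that infers fine-grained causal structures and uses them for prediction to improve robustness in reinforcement learning. The key idea is to jointly learn the dynamics model with a discrete latent variable that quantizes the state-action space into subgroups, which identifies meaningful contexts that display sparse dependencies and learns a causal structure for each subgroup during training. Experimental results demonstrate the robustness of our method to unseen states and locally spurious correlations in downstream tasks, as well as the effectiveness of the subgroup-based quantization approach in discovering fine-grained causal relationships compared to prior methods.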

Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning

Learning dynamics models accurately is an important goal for Model-Based
Reinforcement Learning (MBRL), but most MBRL methods learn a dense dynamics
model, which is vulnerable to spurious correlations and therefore generalizes
poorly to unseen states. In this paper, we introduce Causal Dynamics Learning
for Task-Independent State Abstraction (CDL), which first learns a
theoretically grounded causal dynamics model that removes unnecessary
dependencies between state variables and the action, thus generalizing well to
unseen states. A state abstraction can then be derived from the learned
dynamics, which not only improves sample efficiency but also applies to a wider
range of tasks than existing state abstraction methods. Evaluated on two
simulated environments and downstream tasks, both the dynamics model and
policies learned by the proposed method generalize well to unseen states, and
the derived state abstraction improves sample efficiency compared to learning
without it.
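
To make the step from a learned causal graph to a state abstraction concrete, here is a small Python sketch. The abstract does not say how the abstraction is derived, so the rule used here (keep the state variables the action can influence, directly or through other variables) is an assumption for illustration only. The function name `action_influenced` and the toy graph are hypothetical.

```python
def action_influenced(parents, action="a"):
    """Return state variables reachable from the action in the causal graph.

    parents maps each next-step variable to the set of current-step variables
    (or the action) that the learned model treats as its causal parents.
    The selection rule is an illustrative assumption, not the paper's method.
    """
    kept = set()
    frontier = {action}
    while frontier:
        # Variables with at least one parent in the current frontier.
        frontier = {
            var for var, pa in parents.items()
            if var not in kept and pa & frontier
        }
        kept |= frontier
    return kept

# Toy graph: the action drives x, x drives y, and z evolves on its own.
parents = {
    "x": {"x", "a"},
    "y": {"x", "y"},
    "z": {"z"},
}
abstraction = action_influenced(parents)
```

In this toy graph `z` is dropped because no causal path leads from the action to it. A dense dynamics model, in which every variable depends on every other, would give no basis for discarding it.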

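This paper introduces CDL, a causal dynamics learning method for task-independent state abstraction. CDL derives a state abstraction from the learned dynamics and improves generalization by removing unnecessary dependencies between state variables and the action. On two simulated environments and downstream tasks, it generalizes well to unseen states, and the derived abstraction improves sample efficiency compared to learning without it.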