BriefGPT.xyz
Mar, 2024
Multistep Inverse Is Not All You Need
Alexander Levine, Peter Stone, Amy Zhang
TL;DR
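In real-world control settings, the observation space is often high-dimensional and subject to time-correlated noise. This paper studies the AC-State method, which learns an encoder that maps the observation space to a simpler space of control-relevant variables, and proposes the ACDF algorithm to address problems that AC-State has when learning a latent representation of the agent-controllable state. Numerical simulations and experiments with neural-network encoders in high-dimensional environments demonstrate the effectiveness of ACDF.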
Abstract
In real-world control settings, the observation space is often unnecessarily high-dimensional and subject to time-correlated noise. However, the controllable dynamics of the system are often far simpler than the
→