Equivariant Reinforcement Learning under Partial Observability

Aug, 2024

Hai Nguyen, Andrea Baisero, David Klee, Dian Wang, Robert Platt...

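TL;DR: This work addresses the sample efficiency of robot learning in partially observable environments. By encoding specific group symmetries into neural networks, it proposes an equivariant reinforcement learning method that lets an agent reuse earlier solutions in related scenarios. Experiments show that equivariant agents significantly outperform non-equivariant approaches in both sample efficiency and final performance.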
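To make the idea of encoding a group symmetry concrete, here is a minimal sketch, not the authors' architecture. It assumes a 5x5 image observation, the four-element rotation group C4, and four directional actions; the names `score`, `equivariant_logits`, and `invariant_value` are hypothetical. It is also memoryless, so it ignores the observation history that a partially observable agent would need. An unconstrained scorer is symmetrized over the group so that rotating the observation cyclically shifts the action logits (equivariance) and leaves the value unchanged (invariance).

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 5))  # hypothetical weights of an unconstrained scorer


def score(obs):
    """Arbitrary scalar function of a 5x5 observation with no built-in symmetry."""
    return float(np.tanh(np.sum(W * obs)))


def equivariant_logits(obs):
    """One logit per action k, where action k is action 0 rotated by k * 90 degrees.

    Logit k scores the observation rotated back by k quarter turns, so
    rotating the input by one quarter turn shifts the logits by one slot.
    """
    return np.array([score(np.rot90(obs, -k)) for k in range(4)])


def invariant_value(obs):
    """Average the scorer over all four rotations, giving a rotation-invariant value."""
    return float(np.mean([score(np.rot90(obs, k)) for k in range(4)]))


obs = rng.normal(size=(5, 5))
rotated = np.rot90(obs)  # one counterclockwise quarter turn

# Equivariance: rotating the observation permutes the action logits.
assert np.allclose(equivariant_logits(rotated), np.roll(equivariant_logits(obs), 1))
# Invariance: the value is the same for the original and rotated observation.
assert np.isclose(invariant_value(rotated), invariant_value(obs))
```

Because the symmetry holds by construction, whatever the scorer learns for one orientation carries over to the three rotated ones, which is the kind of solution reuse the summary describes.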

Abstract

Incorporating inductive biases is a promising approach for tackling challenging robot learning domains with sample-efficient solutions. This paper identifies partially observable domains where symmetries can be a useful inductive bias for efficient learning. Specifically, by encoding t