BriefGPT.xyz
Dec, 2023
No Prior Mask: Eliminate Redundant Action for Deep Reinforcement Learning
Dianyu Zhong, Yiqin Yang, Qianchuan Zhao
TL;DR
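Building on a theoretical analysis, we propose a new mechanism for filtering redundant actions and present a simple yet efficient method for policy optimization. The method constructs a similarity factor by estimating the distance between state distributions, and combines it with a modified inverse model to avoid heavy computation in high-dimensional state spaces. Extensive experiments on high-dimensional, pixel-input, and stochastic problems demonstrate the strong performance of our method.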
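The filtering idea can be illustrated with a minimal tabular sketch. It is not the authors' implementation: it assumes the next-state distribution of each action is known exactly, uses total variation distance as the distance measure, and applies a fixed threshold. The summary above specifies none of these choices, and the names `total_variation`, `redundant_action_mask`, and `threshold` are hypothetical.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def redundant_action_mask(next_state_dists, threshold=0.05):
    """Return a keep-mask over the actions available in one state.

    next_state_dists[a] is the (assumed known) next-state distribution
    of action a. An action is dropped when its distribution lies within
    `threshold` of an already-kept action, so exactly one representative
    of each group of near-identical actions survives.
    """
    kept = []  # indices of actions kept so far
    mask = []
    for index, dist in enumerate(next_state_dists):
        is_redundant = any(
            total_variation(dist, next_state_dists[k]) <= threshold
            for k in kept
        )
        mask.append(not is_redundant)
        if not is_redundant:
            kept.append(index)
    return mask


# Toy state with four actions over three possible next states.
dists = [
    [1.0, 0.0, 0.0],    # action 0
    [1.0, 0.0, 0.0],    # action 1: same effect as action 0
    [0.0, 1.0, 0.0],    # action 2
    [0.0, 0.98, 0.02],  # action 3: nearly the same as action 2
]
mask = redundant_action_mask(dists)
print(mask)  # [True, False, True, False]
```

In a high-dimensional or pixel-based setting the distributions would not be available in closed form; the summary indicates that the method relies on a modified inverse model there, which this sketch does not attempt to reproduce.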
Abstract
The large action space is one fundamental obstacle to deploying reinforcement learning methods in the real world. The numerous redundant actions will cause the agents to make repeated or invalid attempts, even le…