BriefGPT.xyz
May, 2024
Optimized Diffusion Policies for Offline Reinforcement Learning
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning
Tianle Zhang, Jiayi Guan, Lin Zhao, Yihang Li, Dongjiang Li...
TL;DR
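This work studies policy optimization for offline reinforcement learning. It models the policy with a diffusion model and improves it through preferred-action optimization, achieving competitive or superior performance on sparse-reward tasks while also demonstrating the effectiveness of anti-noise preference optimization.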
Abstract
Offline reinforcement learning (RL) aims to learn optimal policies from previously collected datasets. Recently, due to their powerful representational capabilities, diffusion models have shown significant potent…