Most existing deep reinforcement learning (DRL) frameworks consider either a
discrete or a continuous action space, but not both. Motivated by
applications in computer games, we consider the scenario with a
discrete-continuous hybrid action space. To handle a hybrid action space,
previous works either approximate the hybrid space by discretization or relax
it into a continuous set. In this paper, we propose a parametrized deep
Q-network (P-DQN) framework for the hybrid action space without approximation
or relaxation. Our algorithm combines the spirit of both DQN (dealing with
discrete action spaces) and DDPG (dealing with continuous action spaces) by
seamlessly integrating them. Empirical results on a simulation example, on
scoring a goal in simulated RoboCup soccer, and on the solo mode of the game
King of Glory (KOG) validate the efficiency and effectiveness of our method.

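In this paper, we propose a parametrized deep Q-network (P-DQN) framework for hybrid action spaces. The algorithm requires no approximation or relaxation and combines the spirit of DQN and DDPG; simulation experiments in RoboCup soccer and the game King of Glory demonstrate the effectiveness of our method.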
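
To make the idea of a discrete-continuous hybrid action concrete, here is a minimal Python sketch of one plausible way to select such an action: a DDPG-like function proposes a continuous parameter for each discrete action, and a DQN-like function scores each (discrete action, parameter) pair. The abstract does not spell out the procedure, so the function names (`select_hybrid_action`, `toy_param_fn`, `toy_q_fn`), the one-dimensional parameter, and the toy scoring rule are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Callable, Tuple

State = Tuple[float, ...]


def select_hybrid_action(
    state: State,
    num_discrete: int,
    param_fn: Callable[[State, int], float],
    q_fn: Callable[[State, int, float], float],
) -> Tuple[int, float]:
    """Return (k, x_k): the best-scoring discrete action and its parameter."""
    best_k, best_x, best_q = 0, 0.0, float("-inf")
    for k in range(num_discrete):
        # DDPG-like step (assumed): a deterministic continuous parameter per action.
        x_k = param_fn(state, k)
        # DQN-like step (assumed): score the discrete action given that parameter.
        q = q_fn(state, k, x_k)
        if q > best_q:
            best_k, best_x, best_q = k, x_k, q
    return best_k, best_x


# Hypothetical stand-ins for learned networks, used only to run the sketch.
def toy_param_fn(state: State, k: int) -> float:
    # Use the k-th state feature as the parameter, clipped to [-1, 1].
    return max(-1.0, min(1.0, state[k]))


def toy_q_fn(state: State, k: int, x: float) -> float:
    # Toy score: favors parameters of larger magnitude, with a small bias per action.
    return abs(x) + 0.1 * k


action = select_hybrid_action((0.2, -0.7), 2, toy_param_fn, toy_q_fn)
print(action)
```

In this sketch the continuous parameter is chosen before the discrete action is scored, so neither discretizing the parameter nor relaxing the discrete choice is needed; whether this matches the paper's exact update rules is not stated in the abstract.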

Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space

Recent work has shown that deep neural networks are capable of approximating
both value functions and policies in reinforcement learning domains featuring
continuous state and action spaces. However, to the best of our knowledge no
previous work has succeeded at using deep neural networks in structured
(parameterized) continuous action spaces. To fill this gap, this paper focuses
on learning within the domain of simulated RoboCup soccer, which features a
small set of discrete action types, each of which is parameterized with
continuous variables. The best learned agent can score goals more reliably than
the 2012 RoboCup champion agent. As such, this paper represents a successful
extension of deep reinforcement learning to the class of parameterized action
space MDPs.

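This paper studies the use of deep neural networks within deep reinforcement learning to handle parameterized continuous action spaces in the simulated RoboCup soccer domain. It successfully extends deep reinforcement learning to the class of parameterized action space MDPs, and its best agent scores goals more reliably than the 2012 RoboCup champion agent.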
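
As a rough illustration of what "a small set of discrete action types, each of which is parameterized with continuous variables" can look like in code, the following Python sketch models such an action space as a table of action types with per-parameter bounds. The action names (`dash`, `turn`, `kick`), the parameter counts, and the numeric bounds are hypothetical placeholders chosen for the example; the abstract does not list them.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple

Bounds = Tuple[float, float]


@dataclass(frozen=True)
class ActionType:
    """A discrete action type with bounded continuous parameters."""
    name: str
    param_bounds: Tuple[Bounds, ...]  # one (low, high) pair per parameter

    def clip(self, params: Sequence[float]) -> Tuple[float, ...]:
        """Clamp each parameter into its allowed range."""
        if len(params) != len(self.param_bounds):
            raise ValueError(
                f"{self.name} expects {len(self.param_bounds)} parameter(s)"
            )
        return tuple(
            min(high, max(low, p))
            for p, (low, high) in zip(params, self.param_bounds)
        )


# Hypothetical action space: names and bounds are illustrative only.
ACTION_SPACE: Dict[str, ActionType] = {
    "dash": ActionType("dash", ((0.0, 100.0), (-180.0, 180.0))),
    "turn": ActionType("turn", ((-180.0, 180.0),)),
    "kick": ActionType("kick", ((0.0, 100.0), (-180.0, 180.0))),
}


def make_action(name: str, params: Sequence[float]) -> Tuple[str, Tuple[float, ...]]:
    """Build a valid (action type, parameters) pair from raw values."""
    return name, ACTION_SPACE[name].clip(params)


example = make_action("kick", (120.0, 45.0))
print(example)
```

An agent acting in such a space has to output both which action type to take and the continuous values that go with it, which is the structure the abstract refers to as a parameterized action space MDP.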