This paper presents a learning-based control strategy for nonlinear throttle valves with asymmetric hysteresis, yielding a near-optimal controller without requiring any prior knowledge of the environment. We start with a carefully tuned Proportional-Integral (PI) controller and exploit recent advances in Reinforcement Learning (RL) with Guides to improve the closed-loop behavior by learning from additional interactions with the valve. We test the proposed control method in various scenarios on three different valves; all of them highlight the benefits of combining the PI and RL frameworks to improve control performance in nonlinear stochastic systems. In all experimental test cases, the resulting agent is more sample-efficient than traditional RL agents and outperforms the PI controller.


Improving a Proportional Integral Controller with Reinforcement Learning on a Throttle Valve Benchmark