In reinforcement learning (RL), accounting for risk is crucial for making decisions under uncertainty, particularly in applications where safety and reliability are paramount. In this paper, we introduce a general framework for Risk-Sensitive Distributional Reinforcement Learning (RS-DisRL), with static Lipschitz Risk Measures (LRM) and general function approximation. Our framework covers a broad class of risk-sensitive RL (RSRL) problems, and facilitates analysis of how estimation functions affect the effectiveness of RSRL strategies, as well as evaluation of their sample complexity. We design two meta-algorithms: \texttt{RS-DisRL-M}, a model-based strategy for model-based function approximation, and \texttt{RS-DisRL-V}, a model-free approach for general value function approximation. With our novel estimation techniques via Least Squares Regression (LSR) and Maximum Likelihood Estimation (MLE) in distributional RL with an augmented Markov Decision Process (MDP), we derive the first regret upper bound with $\widetilde{\mathcal{O}}(\sqrt{K})$ dependency for RSRL with static LRM, marking a pioneering contribution towards statistically efficient algorithms in this domain.
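To make the notion of a static risk measure concrete, the following minimal sketch evaluates one familiar example, the conditional value at risk (CVaR), on sampled episode returns. CVaR is chosen here only as an illustration; the abstract does not say which risk measures the framework is instantiated with, and the function name `empirical_cvar`, the level `alpha`, and the sample values are all assumptions made for this example. "Static" is read in the usual sense: the measure is applied once to the distribution of the whole-episode return rather than recursively at each step.

```python
import numpy as np


def empirical_cvar(returns, alpha):
    """Empirical CVaR at level alpha: mean of the worst alpha-fraction of returns.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    sorted_returns = np.sort(np.asarray(returns, dtype=float))
    # Number of samples in the lower tail (at least one).
    k = max(1, int(np.ceil(alpha * sorted_returns.size)))
    return sorted_returns[:k].mean()


# Illustrative whole-episode returns (made-up values).
returns = np.arange(10, dtype=float)  # 0.0, 1.0, ..., 9.0

cvar_half = empirical_cvar(returns, 0.5)  # averages the five lowest returns
cvar_full = empirical_cvar(returns, 1.0)  # alpha = 1 recovers the plain mean
```

With `alpha = 1.0` the measure reduces to the risk-neutral expected return, while smaller `alpha` puts more weight on the worst outcomes.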

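This work introduces a Risk-Sensitive Distributional Reinforcement Learning (RS-DisRL) framework with static Lipschitz risk measures and general function approximation. The framework is used to analyze how estimation functions affect the effectiveness and sample complexity of RSRL strategies. Two meta-algorithms are designed: RS-DisRL-M, for model-based function approximation, and RS-DisRL-V, for general value function approximation. Using novel estimation techniques based on Least Squares Regression (LSR) and Maximum Likelihood Estimation (MLE), combined with distributional RL in an augmented Markov Decision Process (MDP), the work derives the first $\widetilde{\mathcal{O}}(\sqrt{K})$ dependency of the regret upper bound for RSRL with static Lipschitz risk measures, contributing towards statistically efficient algorithms in this area.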

Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation