Enabling bipedal walking robots to learn how to maneuver over highly uneven,
dynamically changing terrains is challenging due to the complexity of the robot
dynamics and of its interactions with the environment. Recent advancements in learning from
demonstrations have shown promising results for robot learning in complex
environments. While imitation learning of expert policies has been well
explored, learning expert reward functions remains largely under-explored in
legged locomotion. This paper applies state-of-the-art Inverse
Reinforcement Learning (IRL) techniques to bipedal locomotion problems
over complex terrains. We propose algorithms for learning expert reward
functions, and we subsequently analyze the learned functions. Through nonlinear
function approximation, we uncover meaningful insights into the expert's
locomotion strategies. Furthermore, we empirically demonstrate that training a
bipedal locomotion policy with the inferred reward functions enhances its
walking performance on unseen terrains, highlighting the adaptability offered
by reward learning.

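Summary: This work applies Inverse Reinforcement Learning (IRL) to bipedal robot walking over complex terrains and proposes algorithms for learning expert reward functions. Nonlinear function approximation reveals the expert's locomotion strategies, and training a policy with the inferred reward functions improves the robot's walking performance on unseen terrains.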