In this paper, we propose a novel method for learning reward functions directly from offline demonstrations. Unlike traditional Inverse Reinforcement Learning (IRL), our approach decouples the reward function from the learner's policy, eliminating the adversarial interaction typically required between reward and policy optimization.