Modern deep learning (DL) workloads increasingly use complex deep reinforcement learning (DRL) algorithms that generate training data within the learning loop. This results in programs with several nested loops and dynamic data dependencies between tensors. While DL systems with eager