The goal in offline data-driven decision-making is to synthesize decisions that
optimize a black-box utility function, using a previously collected static
dataset, with no active interaction. These problems appear in many forms:
offline reinforcement learning (RL), where we must produce a