The off-policy paradigm casts recommendation as a counterfactual
decision-making task, allowing practitioners to unbiasedly estimate online
metrics using offline data. This leads to effective evaluation metrics, as well
as learning procedures that directly optimise online success. Neve