Off-policy evaluation (OPE) aims to evaluate a target policy using data
generated by other policies. Most previous OPE methods focus on precisely
estimating the true performance of a policy. We observe that in many
applications, (1) the end goal of OPE is to compare two or more candid