Variational inference approximates an unnormalized distribution by
minimizing the Kullback-Leibler (KL) divergence. Although this divergence is
computationally efficient and widely used in applications, it has
some undesirable properties. For example, it is not