One-shot talking-head generation learns to synthesize a talking-head
video from a single source portrait image, driven by a video of the same or a
different identity. These methods usually require plane-based pixel transformations
via Jacobian matrices or facial image warps for novel pose