This paper addresses the task of 3D pose estimation for a hand interacting
with an object from a single image observation. When modeling hand-object
interaction, previous works mainly exploit proximity cues while overlooking
the dynamical requirement that the hand must stably grasp the obj