It remains a challenge to efficiently extract spatial-temporal information
from skeleton sequences for 3D human action recognition. Although most recent
action recognition methods are based on Recurrent Neural Networks (RNNs), which
achieve outstanding performance, one shortcoming of these methods is their
tendency to overemphasize temporal information. Sin