The state-of-the-art multi-agent reinforcement learning (MARL) methods have provided promising solutions to a variety of complex problems. Yet, these methods all assume that agents perform synchronized primitive-action executions so that they are not genuinely scalable to long-horizon