In manufacturing, assembly tasks have been a challenge for learning algorithms because the dynamics vary across environments. Reinforcement learning (RL) is a promising framework for learning these tasks automatically, yet it is still not easy to apply a learned policy or skill, that i