Existing counting methods often adopt regression-based approaches and thus cannot precisely localize the target objects, which hinders the further applications and analysis (e.g., high-level understanding and fine-grained classification). In addition, most of prior work mainly focus on counting objects in the static environments with fixed cameras. Motivated