Recent research on human pose estimation exploits complex structures to improve performance on benchmark datasets, while ignoring the resource overhead and inference speed that matter when the model is actually deployed. In this paper, we reduce the computation cost and parameter count of the deconvolution head network in SimpleBaseline and introduce an attention mechanism that uses original, inter-level, and intra-level information to improve accuracy. Additionally, we propose a novel loss function called heatmap weighting loss, which generates a weight for each pixel on the heatmap so that the model focuses more on keypoints. Experiments demonstrate that our method achieves a balance among performance, resource usage, and inference speed. Specifically, our method achieves 65.3 AP on COCO test-dev, with inference speeds of 55 FPS and 18 FPS on the mobile GPU and CPU, respectively.

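This paper proposes a method that reduces resource load and speeds up inference. It introduces an attention mechanism into the deconvolution head network of SimpleBaseline that uses original, inter-level, and intra-level information to improve accuracy, and it adopts a new loss function, called heatmap weighting loss, that generates a weight for each pixel on the heatmap so that the model focuses more on keypoints. Experiments show that our method achieves a balance among performance, resource usage, and inference speed, and is reasonably practical.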
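
To make the idea of a per-pixel weighted heatmap loss concrete, the following is a minimal sketch in plain Python. It is not the paper's exact formulation: the text above does not give the weight function, so the sketch assumes, purely for illustration, that each pixel's weight is `base + target`, which makes pixels near a keypoint count more than background pixels. The function names `gaussian_heatmap` and `heatmap_weighted_mse` and the parameters `sigma` and `base` are hypothetical.

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma=1.0):
    """Build a ground-truth heatmap: a 2D Gaussian peaked at keypoint (cx, cy)."""
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def heatmap_weighted_mse(pred, target, base=1.0):
    """Mean squared error with one weight per heatmap pixel.

    ASSUMPTION: weight = base + target value. This is an illustrative
    choice, not the weight function defined in the paper.
    """
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            weight = base + t  # larger near the keypoint, close to base elsewhere
            total += weight * (p - t) ** 2
            count += 1
    return total / count


# Compare the same error magnitude at the keypoint and at a background corner.
target = gaussian_heatmap(5, 5, cx=2, cy=2)

pred_peak_error = [row[:] for row in target]
pred_peak_error[2][2] += 0.5  # error placed on the keypoint pixel

pred_corner_error = [row[:] for row in target]
pred_corner_error[0][0] += 0.5  # same error placed on a background pixel

loss_peak = heatmap_weighted_mse(pred_peak_error, target)
loss_corner = heatmap_weighted_mse(pred_corner_error, target)
print(loss_peak > loss_corner)
```

Under this assumed weighting, an error of equal size is penalized roughly twice as heavily at the keypoint as at a far background pixel, which is the qualitative behavior described above: the model is pushed to focus on keypoints.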

Lightweight Human Pose Estimation Using Heatmap Weighting Loss