We present a scene parsing method that exploits global context information using both parametric and non-parametric models. Whereas previous methods exploit only the local relationships between objects, we train a context network based on scene similarities to generate feature representations of the global context. These learned features are also used to generate global and spatial priors for explicit class inference. We then design modules that embed the feature representations and the priors into the segmentation network as additional global context cues. We show that the proposed method can eliminate false positives that are incompatible with the global context representations. Experiments on the MIT ADE20K and PASCAL Context datasets show that the proposed method performs favorably against existing methods.
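To make the non-parametric side of this idea concrete, the following is a minimal sketch of how global and spatial class priors might be derived from the annotations of similar scenes. It assumes that context features have already been computed for a bank of annotated training images and that the priors are simple label statistics over the k nearest neighbors. The function names (`knn_indices`, `global_and_spatial_priors`), the cosine-similarity retrieval, the coarse label grid, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def knn_indices(query, bank, k):
    """Return indices of the k bank rows most cosine-similar to the query.

    Hypothetical helper: cosine similarity is an assumed retrieval metric.
    """
    q = query / (np.linalg.norm(query) + 1e-8)
    b = bank / (np.linalg.norm(bank, axis=1, keepdims=True) + 1e-8)
    sims = b @ q
    return np.argsort(-sims)[:k]


def global_and_spatial_priors(neighbor_labels, num_classes):
    """Compute class priors from the label maps of retrieved neighbors.

    neighbor_labels: (k, H, W) integer class ids on a coarse grid.
    Returns:
      global_prior  (C,)      fraction of neighbors that contain each class
      spatial_prior (C, H, W) fraction of neighbors with each class per cell
    """
    one_hot = np.eye(num_classes)[neighbor_labels]           # (k, H, W, C)
    spatial_prior = one_hot.mean(axis=0).transpose(2, 0, 1)  # (C, H, W)
    present = one_hot.any(axis=(1, 2))                       # (k, C)
    global_prior = present.mean(axis=0)                      # (C,)
    return global_prior, spatial_prior


# Toy data (made up for illustration): 4 annotated scenes with 2-D context
# features and 2x2 label grids over 3 classes.
bank_features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
bank_labels = np.array([
    [[0, 0], [1, 1]],
    [[0, 0], [0, 1]],
    [[2, 2], [2, 2]],
    [[2, 2], [1, 2]],
])

query_feature = np.array([1.0, 0.05])
idx = knn_indices(query_feature, bank_features, k=2)
global_prior, spatial_prior = global_and_spatial_priors(bank_labels[idx], num_classes=3)
```

In this toy example the two retrieved scenes contain only classes 0 and 1, so class 2 receives a global prior of zero. A segmentation network given such a prior as an extra cue could learn to suppress predictions of class 2 for this query, which is the kind of false-positive removal the abstract describes. How the priors and features are embedded into the network is handled by learned modules in the proposed method and is not covered by this sketch.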

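This paper proposes a scene parsing method that uses global context information based on parametric and non-parametric models. A context network is trained on scene similarities to produce feature representations, which are used to generate spatial and global priors; these feature representations and priors are then embedded into the segmentation network as additional global context cues. Experiments show that the method can eliminate false positives that are incompatible with the global context representations and that it performs well on the MIT ADE20K and PASCAL Context datasets.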

Scene Parsing with Global Context Embedding