Environmental sound scene and sound event recognition is important for
detecting suspicious events in indoor and outdoor environments (such as
nurseries, smart homes, and nursing homes) and is a fundam
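This paper proposes a deep learning model based on the Audio Visual Scene Graph Segmenter (AVSGS), which embeds the visual structure of a scene and segments it into subgraphs to perform audio source separation. It also introduces a new dataset, Audio Separation in the Wild (ASIW), and demonstrates the method's excellent performance on audio source separation.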