BriefGPT.xyz
Aug, 2023
Bird's-Eye-View Scene Graph for Vision-Language Navigation
Rui Liu, Xiaohan Wang, Wenguan Wang, Yi Yang
TL;DR
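A BEV scene graph encodes the scene layout and geometric cues of indoor environments, addressing the limitations of vision-language navigation with respect to 3D scene geometry and the choice of panoramic observations. The method significantly outperforms existing approaches on the REVERIE, R2R, and R4R datasets, demonstrating the potential of BEV perception for vision-language navigation.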
Abstract
Vision-language navigation (VLN), which requires an agent to navigate 3D environments following human instructions, has shown great advances. However, current agents are built upon panoramic observations, which hi