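TL;DR: We study how to infer 3D object-centric scene representations from a single image. By modeling intrinsic and extrinsic object properties separately, we address the limitations of existing unsupervised methods, enabling unsupervised learning of high-fidelity object-centric scene representations from sparse real-world images.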
Abstract
We study inferring 3D object-centric scene representations from a single
image. While recent methods have shown potential in unsupervised 3D object
discovery from simple synthetic images, they fail to generalize to real-world
scenes with visually rich and diverse objects. This limitati