Understanding the 3D world is a fundamental problem in computer vision. However, learning a good representation of 3D objects remains an open problem because of the high dimensionality of the data and the many factors of variation involved. In this work, we investigate the task of single-view 3D object reconstruction from a learning agent's perspective. We formulate the learning process as an interaction between 3D and 2D representations and propose an encoder-decoder network with a novel projection loss defined by the perspective transformation. More importantly, the projection loss enables unsupervised learning from 2D observations without explicit 3D supervision. We demonstrate the model's ability to generate a 3D volume from a single 2D image with three sets of experiments: (1) learning from single-class objects; (2) learning from multi-class objects; and (3) testing on novel object classes. Results show superior performance and better generalization for 3D object reconstruction when the projection loss is used.

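This work investigates single-view 3D object reconstruction from a learning agent's perspective and proposes an encoder-decoder network with a novel projection loss defined by the perspective transformation. The loss enables unsupervised learning that generates a 3D volume from a single 2D image, and experiments show that it improves both the performance and the generalization of 3D object reconstruction.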

Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision
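
To make the idea of a projection loss concrete, the following is a minimal NumPy sketch, not the network or loss from the paper. It assumes a cubic occupancy grid spanning [-1, 1]^3, a pinhole camera on the negative z-axis, nearest-neighbor sampling along each viewing ray, a max over depth to form the silhouette, and a mean squared error against a 2D mask. The function names `project_silhouette` and `projection_loss` and the parameters `focal`, `cam_dist`, and `n_depth` are hypothetical choices made for illustration. A trainable version would need a differentiable sampler in an autodiff framework.

```python
import numpy as np


def project_silhouette(volume, focal=2.0, cam_dist=3.0, out_size=32, n_depth=32):
    """Render a silhouette of a cubic occupancy grid under a pinhole camera.

    volume: (N, N, N) array of occupancies in [0, 1], indexed [z, y, x],
            assumed to fill the cube [-1, 1]^3 in world space.
    The camera is assumed to sit at (0, 0, -cam_dist), looking along +z.
    """
    n = volume.shape[0]

    # Pixel coordinates on the image plane, normalized to [-1, 1].
    u = np.linspace(-1.0, 1.0, out_size)
    uu, vv = np.meshgrid(u, u, indexing="xy")

    # Depths measured from the camera, spanning the front to back face of the cube.
    depths = np.linspace(cam_dist - 1.0, cam_dist + 1.0, n_depth)[:, None, None]

    # Perspective back-projection: a pixel (u, v) at depth d maps to
    # x = u * d / f, y = v * d / f, z = d - cam_dist in world space.
    x = uu[None] * depths / focal
    y = vv[None] * depths / focal
    z = np.broadcast_to(depths - cam_dist, x.shape)

    def to_index(coord):
        # Map world coordinates in [-1, 1] to the nearest voxel index.
        return np.rint((coord + 1.0) * 0.5 * (n - 1)).astype(int)

    ix, iy, iz = to_index(x), to_index(y), to_index(z)
    inside = (
        (ix >= 0) & (ix < n) & (iy >= 0) & (iy < n) & (iz >= 0) & (iz < n)
    )

    # Sample occupancy along each ray; points outside the grid count as empty.
    samples = np.zeros(x.shape, dtype=float)
    samples[inside] = volume[iz[inside], iy[inside], ix[inside]]

    # A pixel is covered if any sample along its ray is occupied.
    return samples.max(axis=0)


def projection_loss(volume, mask, **camera):
    """Mean squared error between the projected silhouette and a 2D mask."""
    projected = project_silhouette(volume, out_size=mask.shape[0], **camera)
    return float(np.mean((projected - mask) ** 2))


if __name__ == "__main__":
    target = np.zeros((16, 16, 16))
    target[4:12, 4:12, 4:12] = 1.0  # a solid box as the "true" shape
    mask = project_silhouette(target)  # the only supervision: a 2D silhouette
    print("loss of true shape:", projection_loss(target, mask))
    print("loss of empty grid:", projection_loss(np.zeros_like(target), mask))
```

The point of the sketch is that the loss compares only 2D quantities: the predicted volume is never compared with a ground-truth volume, which is what allows learning without explicit 3D supervision.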