Reconstructing the three-dimensional structure of a scene is a classic and fundamental problem in computer vision, and one that has been revolutionized by recent progress in deep learning. In this paper, we survey this rich and growing area. We divide the work into four main threads: 3D reconstruction from two calibrated images taken by a binocular camera; 3D reconstruction from more than two images taken by the same camera or by more than two calibrated cameras; object-focused 3D reconstruction with relaxed camera calibration; and SLAM-based techniques. We summarize each approach along five salient dimensions: algorithmic and deep network characteristics, output representation, datasets, and quantitative comparisons among different methods. We also discuss key challenges and future directions.