Deep learning-based 3D reconstruction from multiple images: A survey

Authors

Chuhua Wang

Published

June 11, 2024

Abstract

Reconstructing the three-dimensional structure of a scene is a classic and fundamental problem in computer vision, and one that recent progress in deep learning has revolutionized. In this paper, we survey this rich and growing area. We divide the work into four main threads: 3D reconstruction from two calibrated images captured by a binocular camera; 3D reconstruction from more than two images taken by the same camera or by more than two calibrated cameras; object-focused 3D reconstruction with relaxed camera calibration; and SLAM-based techniques. We summarize each approach along five salient dimensions: algorithmic and deep network characteristics, output representation, datasets, and quantitative comparisons among different methods. We also discuss key challenges and future directions.