Geometry Inspired Deep Neural Networks for 3D reconstruction
Open Access
- Author:
- Yang, Fengting
- Graduate Program:
- Information Sciences and Technology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- February 22, 2022
- Committee Members:
- Lingzhou Xue, Outside Unit & Field Member
- C Lee Giles, Major Field Member
- Mary Beth Rosson, Program Head/Chair
- Sharon Huang, Chair & Dissertation Advisor
- Zihan Zhou, Major Field Member
- Keywords:
- 3D reconstruction
- Geometry
- Deep Neural Networks
- Abstract:
- Reconstructing 3D models from 2D images is one of the most fundamental problems in computer vision. Most traditional 3D reconstruction algorithms rely on geometric knowledge and attempt to tackle this problem with hand-crafted features. However, these algorithms are usually fragile when the images are noisy or contain textureless or repetitive patterns. In contrast, recent deep neural network-based methods rely on image patterns and learn 3D information (e.g., depth and surface normals) in a data-driven manner. Without explicit geometric knowledge, these networks often suffer a performance drop when applied to environments that differ significantly from the training ones. In this dissertation, we ask whether these shortcomings can be addressed by combining the merits of traditional geometry-based methods and recent data-driven methods. To answer this question, we explore four popular 3D reconstruction tasks: (1) single-view 3D reconstruction, (2) stereo matching, (3) multi-view stereo (MVS), and (4) depth-from-focus (DFF). In each task, we take a deep neural network as the basic framework and integrate task-specific geometric knowledge into the network design. More specifically, in single-view reconstruction, we introduce plane regularity into the network and propose a structure-induced loss that trains the network to recover 3D planes without supervision from ground-truth plane annotations. In stereo matching, we apply the piecewise planar model to the network to better preserve object boundaries and fine details. We develop a fully convolutional network-based superpixel segmentation approach and incorporate it into an existing stereo matching network by treating each superpixel as the projection of a slanted plane in the scene. In MVS, we integrate two common indoor priors into a truncated signed distance function (TSDF) regression network for indoor multi-view reconstruction. Finally, in DFF, we consider the special projective geometry of the defocus system and propose a deep differential focus volume for the DFF network. By developing these geometry-inspired networks for various tasks, we validate the effectiveness of integrating geometry with deep networks and provide an important stepping stone toward high-performance 3D reconstruction methods in multiple application settings.