Reconstructing a 3D face from visual data is crucial for digital face modeling and manipulation. Traditional methods depend predominantly on RGB images, which are sensitive to lighting variations and carry only 2D information. Depth images, by contrast, are resistant to lighting changes and capture 3D data directly, making them a promising basis for robust reconstruction.
Recent studies have turned to deep learning for more robust reconstruction from depth data; however, the scarcity of real depth images with accurate 3D facial labels has hindered training. Models trained on automatically synthesized data have generalized poorly to real-world scenarios because of the domain gap between synthetic and real data.
A research team led by Xiaoxu Cai published their findings on 15 Feb 2024 in Frontiers of Computer Science. Their research introduces a novel domain-adaptive reconstruction method that combines deep learning with a mix of auto-labeled synthetic data and unlabeled real data. This approach makes it possible to reconstruct a 3D face from a single depth image captured in the real world.
Their method uses two domain-adaptive neural networks, one dedicated to predicting head pose and the other to predicting facial shape. Each network is trained with a strategy tailored to its task.
The head pose network is trained using a straightforward fine-tuning method, whereas a more robust adversarial domain adaptation approach is applied to train the facial shape network.
The main pipeline of the proposed 3D face reconstruction method. Credit: Xiaoxu Cai, Jianwen Lou, Jiajun Bu, Junyu Dong, Haishuai Wang, Hui Yu
Comparison with the state-of-the-art depth-based method, FDR. RGB images serve solely as visual references here and are not used as inputs in the reconstruction algorithm. Credit: Xiaoxu Cai, Jianwen Lou, Jiajun Bu, Junyu Dong, Haishuai Wang, Hui Yu.
The first preprocessing step converts the pixel values of the depth image into 3D point coordinates in camera space. This lets the reconstruction network use 2D convolutions to process 3D geometric information. The network predicts 3D vertex offsets, which gives a more concentrated target distribution and eases learning.
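The conversion from depth pixels to camera-space points can be sketched as a back-projection. This is a minimal illustration, not the authors' code: the article does not describe the camera model, so the pinhole model, the intrinsics names fx, fy, cx and cy, and the function name depth_to_point_map are all assumptions made for the example.

```python
import numpy as np


def depth_to_point_map(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth image into an (H, W, 3) map of XYZ points.

    Assumes a pinhole camera (an assumption, not stated in the article):
    fx, fy are focal lengths in pixels and (cx, cy) is the principal point.
    """
    h, w = depth.shape
    # u indexes columns (image x), v indexes rows (image y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    # Stack as channels so the result keeps the image grid layout.
    return np.stack([x, y, z], axis=-1)


# Toy input: a flat surface 2 units from the camera.
depth = np.full((4, 4), 2.0)
points = depth_to_point_map(depth, fx=2.0, fy=2.0, cx=1.5, cy=1.5)
```

Because the result keeps the pixel grid and stores X, Y and Z as three channels, it can be passed to ordinary 2D convolution layers in the same way as a three-channel image.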
The method is evaluated on challenging real-world datasets, where it performs competitively with state-of-the-art techniques.
More information:
Xiaoxu Cai et al, Single depth image 3D face reconstruction via domain adaptive learning, Frontiers of Computer Science (2024). DOI: 10.1007/s11704-023-3541-7
Provided by
Higher Education Press