Multi-step depth enhancement refine network with multi-view stereo.

Yuxuan Ding, Kefeng Li, Guangyuan Zhang, Zhenfang Zhu, Peng Wang, Zhenfei Wang, Chen Fu, Guangchen Li, Ke Pan

PLoS One

College of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan, Shandong, China.

Published: May 2025

This paper introduces a multi-view stereo matching network, the Multi-Step Depth Enhancement Refine Network (MSDER-MVS), aimed at improving the accuracy and computational efficiency of high-resolution 3D reconstruction. MSDER-MVS combines the representational power of modern deep learning with the geometric intuition of traditional 3D reconstruction techniques, with a particular focus on the quality of the depth map and the efficiency of the reconstruction process.

Our key innovations include a dual-branch fusion structure and a Feature Pyramid Network (FPN) that extract and integrate multi-scale features. With this approach, we construct depth maps progressively from coarse to fine, improving depth prediction accuracy at each refinement stage. For cost volume construction, we employ a variance-based metric to integrate information from multiple views, which improves the consistency of the estimates. Moreover, we introduce a differentiable depth optimization process that iteratively improves the depth estimate using residuals and the Jacobian matrix, without additional learnable parameters. This significantly increases the network's convergence rate and the fineness of its depth predictions.

Extensive experiments on the standard DTU dataset (Aanas H, 2016) show that MSDER-MVS surpasses current advanced methods in accuracy, completeness, and overall performance metrics. Particularly in scenes rich in detail, our method recovers surface details and textures more precisely, demonstrating its effectiveness for practical applications.

Overall, MSDER-MVS offers a robust solution for precise and efficient 3D scene reconstruction. Looking forward, we aim to extend this approach to more complex environments and larger-scale datasets, further improving the model's generalization and real-time processing capabilities and promoting the wider deployment of multi-view stereo matching technology in practice.
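
Two steps named in the abstract, the variance-based cost metric and the residual/Jacobian depth update, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the MSDER-MVS implementation: the function names `variance_cost` and `gauss_newton_depth`, the single-pixel scalar-depth setting, the Gauss-Newton form of the update, and the finite-difference Jacobian are all assumptions made for the example. The abstract only states that the refinement uses residuals and the Jacobian matrix and adds no learnable parameters.

```python
from typing import Callable, List, Sequence


def variance_cost(view_features: Sequence[Sequence[float]]) -> float:
    """Matching cost for one pixel at one depth hypothesis (hypothetical helper).

    view_features[v][c] is channel c of the feature sampled from view v
    (the reference view plus the warped source views). Low variance means
    the views agree, so the depth hypothesis is plausible.
    """
    n_views = len(view_features)
    n_channels = len(view_features[0])
    total = 0.0
    for c in range(n_channels):
        values = [feat[c] for feat in view_features]
        mean = sum(values) / n_views
        total += sum((v - mean) ** 2 for v in values) / n_views
    return total / n_channels  # average of the per-channel variances


def gauss_newton_depth(
    residual_fn: Callable[[float], List[float]],
    depth: float,
    steps: int = 5,
    eps: float = 1e-6,
) -> float:
    """Refine a scalar depth with Gauss-Newton steps (assumed update rule).

    residual_fn(depth) returns the residual vector r, for example feature
    differences between the reference view and the warped source views.
    The Jacobian dr/d(depth) is approximated by forward differences here;
    this is an assumption made to keep the sketch self-contained.
    The update has no learnable parameters.
    """
    for _ in range(steps):
        r = residual_fn(depth)
        r_shifted = residual_fn(depth + eps)
        jac = [(b - a) / eps for a, b in zip(r, r_shifted)]
        jtj = sum(j * j for j in jac)
        if jtj == 0.0:
            break  # residuals do not depend on depth here; stop updating
        jtr = sum(j * a for j, a in zip(jac, r))
        depth -= jtr / jtj  # d <- d - (J^T J)^{-1} J^T r
    return depth


if __name__ == "__main__":
    # Two views that agree exactly, then a toy residual that vanishes at depth 2.
    print(variance_cost([[1.0, 2.0], [1.0, 2.0]]))
    print(gauss_newton_depth(lambda d: [d - 2.0, 2.0 * (d - 2.0)], depth=5.0))
```

In a network the same arithmetic would run over all pixels and depth hypotheses at once, and the Jacobian would typically come from an analytic expression or automatic differentiation. The scalar version only shows the computation.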

Full text:
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11824967
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0314418

Similar Publications

Multi-view Hand Reconstruction with a Point-Embedded Transformer.

IEEE Trans Pattern Anal Mach Intell

August 2025

Lixin Yang, Licheng Zhong, Pengxiang Zhu, Xinyu Zhan, Junxiao Kong

This work introduces a novel and generalizable multi-view Hand Mesh Reconstruction (HMR) model, named POEM, designed for practical use in real-world hand motion capture scenarios. The advances of the POEM model consist of two main aspects. First, concerning the modeling of the problem, we propose embedding a static basis point within the multi-view stereo space.

Detail-aware multi-view stereo network for depth estimation.

Appl Opt

July 2025

Haitao Tian, Junyang Li, Chenxing Wang, Helong Jiang

Multi-view stereo methods have achieved great success in depth estimation with coarse-to-fine depth learning frameworks; however, existing methods perform poorly in recovering the depth of object boundaries and detail regions. To address these issues, we propose a detail-aware multi-view stereo network with a coarse-to-fine framework. The geometric depth cues hidden in the coarse stage are used to maintain the geometric structural relationships between object surfaces and to enhance the expressive capability of image features.

Reconstruction method for highly curved surfaces using bi-direction laser speckle.

Appl Opt

March 2025

Wanlin Pan, Yonghong Wang, Jiangxun Zhou, Huanqing Wang, Junrui Li

The inability of single-direction speckle projection to fully cover large-curvature surfaces limits the reconstruction accuracy and surface completeness. This study proposes a method for high-precision 3D reconstruction of large-curvature surfaces, using a multi-camera array combined with laser-projected speckles. A low-cost laser speckle projection device is developed to generate speckle patterns, with the optimal distance for speckle generation determined based on the relationship between the frosted glass and lenses.

Lightweight and Accurate Multi-View Stereo With Confidence-Aware Diffusion Model.

IEEE Trans Pattern Anal Mach Intell

August 2025

Fangjinhua Wang, Qingshan Xu, Yew-Soon Ong, Marc Pollefeys

To reconstruct 3D geometry from calibrated images, learning-based multi-view stereo (MVS) methods typically perform multi-view depth estimation and then fuse the depth maps into a mesh or point cloud. To improve computational efficiency, many methods initialize a coarse depth map and then gradually refine it at higher resolutions. Recently, diffusion models have achieved great success in generation tasks.

Fourier Lightfield Multiview Stereoscope for Large Field-of-View 3D Imaging in Microsurgical Settings.

Adv Photonics Nexus

June 2025

Duke University, Biomedical Engineering, Durham, 27708, NC, USA.

Clare B Cook, Kevin C Zhou, Martin Bohlen, Mark Harfouche, Kanghyun Kim

This work presents the Fourier Lightfield Multi-view Stereoscope (FiLM-Scope), a novel imaging device that combines concepts from Fourier Light Field Microscopy and Multi-view Stereo imaging to capture high-resolution 3D videos over large fields-of-view. The FiLM-Scope optical hardware consists of a multi-camera array, with 48 individual micro-cameras, placed behind a high-throughput primary lens. This allows the FiLM-Scope to simultaneously capture 48 unique 12.
