A simple monocular depth estimation network for balancing complexity and accuracy.

Xuanxuan Liu, Shuai Tang, Mengdie Feng, Xueqi Guo, Yanru Zhang, Yan Wang

Sci Rep

Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen 518000, Guangdong, China.

Published: April 2025

Monocular depth estimation plays a crucial role in many downstream visual tasks. Although the field is relatively mature, existing methods commonly gain accuracy by increasing both computational complexity and parameter count. In practical applications, improving depth prediction accuracy while preserving computational efficiency remains challenging. To address this, we propose SimMDE, a simple depth estimation model that treats monocular depth estimation as an ordinal regression problem. Starting from a baseline encoder, the model adds a Deformable Cross-Attention Feature Fusion (DCF) decoder with sparse attention, which efficiently integrates multi-scale feature maps and markedly reduces the quadratic cost of Transformer attention. To extract finer local features, we propose a Local Multi-dimensional Convolutional Attention (LMC) module, and to achieve precise pixel-level classification we propose a Wavelet Attention Transformer (WAT) module. Extensive experiments on two widely used depth estimation benchmarks, NYU and KITTI, show that the model attains high depth estimation accuracy while maintaining high computational efficiency. In particular, SimMDE, which extends AdaBins, improves the absolute relative error (AbsRel) by 11.7% on NYU and 10.3% on KITTI while using fewer parameters.
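
To make the reported metric concrete, the following is a minimal Python sketch, not the authors' code. It assumes the standard definition of AbsRel (mean of |prediction - ground truth| / ground truth over pixels with valid ground truth) and assumes the quoted percentages are relative reductions of that error against the AdaBins baseline; the abstract does not state either explicitly. It also illustrates the AdaBins-style decoding step, in which a per-pixel depth is the probability-weighted sum of depth-bin centers; whether SimMDE decodes depth in exactly this way is an assumption. All function names are hypothetical.

```python
from typing import Sequence


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean absolute relative error over pixels with valid (positive) ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def relative_improvement(baseline: float, ours: float) -> float:
    """Fractional reduction of an error metric relative to a baseline (assumed reading)."""
    return (baseline - ours) / baseline


def depth_from_bins(probs: Sequence[float], edges: Sequence[float]) -> float:
    """AdaBins-style decoding for one pixel: sum of bin centers weighted by bin probability."""
    centers = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
    if len(centers) != len(probs):
        raise ValueError("need exactly one probability per bin")
    return sum(p * c for p, c in zip(probs, centers))


if __name__ == "__main__":
    # Toy values for illustration only; these are not results from the paper.
    print(abs_rel([1.0, 2.2, 3.0], [1.0, 2.0, 4.0]))
    print(relative_improvement(0.100, 0.090))
    print(depth_from_bins([0.25, 0.5, 0.25], [0.0, 2.0, 4.0, 6.0]))
```

Under this reading, a lower AbsRel is better, and an 11.7% improvement means the error shrinks to about 88.3% of the baseline value.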

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11997066
DOI: http://dx.doi.org/10.1038/s41598-025-97568-1

Top Keywords: depth estimation; monocular depth; depth; accuracy depth; computational efficiency; nyu kitti; estimation; simple monocular; estimation network; network balancing

Similar Publications

Acceptability and effectiveness of empathy-based provider training and community-level awareness activities on self-injectable contraceptive use in Niger, Lagos, and Oyo States, Nigeria: a mixed methods program evaluation.

BMC Womens Health

September 2025

Society for Family Health-Nigeria, Abuja, Nigeria.

Susan Ontiri, Claire W Rothschild, Fauzia Tariq, Oluwaseun Adeleke, Michael Titus

Background: Interventions that aim to increase healthcare provider empathy and capacity to deliver person-centered care have been shown to improve healthcare seeking and outcomes. In the context of self-injectable contraception, empathetic counseling and coaching may be promising approaches for addressing "fear of the needle" among clients interested in using subcutaneous depot medroxyprogesterone (DMPA-SC). In Nigeria, the Delivering Innovation in Self-Care (DISC) project developed and evaluated an empathy-based in-service training and supportive supervision intervention for public sector family planning (FP) providers, implemented in conjunction with community-based mobilization.

Single camera estimation of microswimmer depth with a convolutional network.

J R Soc Interface

September 2025

Institute of Intelligent Systems and Robotics, Sorbonne Université, Paris, Île-de-France, France.

Ali Hosseini, Célia Fosse, Maya Awada, Marcel Stimberg, Romain Brette

A number of techniques have been developed to measure the three-dimensional trajectories of protists, which require special experimental set-ups, such as a pair of orthogonal cameras. On the other hand, machine learning techniques have been used to estimate the vertical position of spherical particles from the defocus pattern, but they require the acquisition of a labelled dataset with finely spaced vertical positions. Here, we describe a simple way to make a dataset of images labelled with vertical position from a single 5 min movie, based on a tilted slide set-up.

HIPSTR: highest independent posterior subtree reconstruction in TreeAnnotator X.

Bioinformatics

September 2025

Institute of Ecology and Evolution, University of Edinburgh, Edinburgh, United Kingdom.

Guy Baele, Luiz M Carvalho, Marius Brusselmans, Gytis Dudas, Xiang Ji

Summary: In Bayesian phylogenetic and phylodynamic studies it is common to summarise the posterior distribution of trees with a time-calibrated summary phylogeny. While the maximum clade credibility (MCC) tree is often used for this purpose, we here show that a novel summary tree method, the highest independent posterior subtree reconstruction (HIPSTR), contains clades with consistently higher support than MCC. We also provide faster computational routines for estimating both summary trees in an updated version of TreeAnnotator X, an open-source software program that summarizes the information from a sample of trees and returns many helpful statistics, such as the individual clade credibilities contained in the summary tree.

Directed message passing neural networks enhanced graph convolutional learning for accurate polymer density prediction.

J Chem Phys

September 2025

National Synchrotron Radiation Laboratory, State Key Laboratory of Advanced Glass Materials, Anhui Provincial Engineering Research Center for Advanced Functional Polymer Films, University of Science and Technology of China, Hefei, Anhui 230029, China.

Shenyang Sun, Fucheng Tian, Chenhao Zhao, Mengyu Xie, Wenyi Li

Polymer density is a critical factor influencing material performance and industrial applications, and it can be tailored by modifying the chemical structure of repeating units. Traditional polymer density characterization methods rely heavily on domain expertise; however, the vast chemical space comprising over one million potential polymer structures makes conventional experimental screening inefficient and costly. In this study, we proposed a machine learning framework for polymer density prediction, rigorously evaluating four models: neural networks (NNs), random forest (RF), XGBoost, and graph convolutional neural networks (GCNNs).

Evolution of Motor and Nonmotor Characteristics in an Idiopathic/Isolated REM Sleep Behavior Disorder Cohort.

Neurology

October 2025

Montreal Neurological Institute-Hospital, McGill University, Montreal, Canada.

Aline Delva, Seyed-Mohammad Fereshtehnejad, Andrew Vo, Chun William Yao, Amélie Pelletier

Background And Objectives: Years before diagnosis of Parkinson disease (PD), dementia with Lewy bodies (DLB), or multiple system atrophy (MSA), mild prodromal manifestations can be detected. Longitudinal follow-up of people with prodromal synucleinopathy, particularly idiopathic/isolated REM sleep behavior disorder (iRBD), enables in-depth clinical phenotyping of early disease, which could facilitate stratification for clinical trials, provide the definition of appropriate end points, or predict phenoconversion more precisely. The aim of this study was to update and expand on previous studies assessing clinical evolution from iRBD to clinically diagnosed disease, up to 14 years before diagnosis.
