In this paper, we propose a novel method for monocular depth estimation using an hourglass neck module. The proposed method makes the following contributions. First, feature maps are extracted from Swin Transformer V2 using a masked image modeling (MIM) pretrained model. Since Swin Transformer V2 has a different patch size for each attention stage, it is easier to extract local and global features from images input to the vision transformer (ViT)-based encoder. Second, to maintain the polymorphism and local inductive bias of the feature map extracted from Swin Transformer V2, the feature map is input into the hourglass neck module. Third, deformable attention can be used at the waist of the hourglass neck module to reduce the computation cost and highlight the locality of the feature map. Finally, the feature map traverses the neck and proceeds through a decoder, composed of a deconvolution layer and an upsampling layer, to generate a depth image. To evaluate the proposed method objectively, we compared it on the NYU Depth V2 dataset against methods published in other papers. In the experiment, the proposed method achieved an RMSE of 0.274, which was lower than the values published in those papers. Because a lower RMSE indicates more accurate depth estimation, this result indicates the advantage of the proposed method over the compared techniques on this metric.
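The RMSE metric cited above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `depth_rmse`, the nested-list depth maps, and the optional `valid_mask` argument are all assumptions introduced here. Masking out pixels without ground-truth depth is a common convention, but the abstract does not state whether it was applied.

```python
import math


def depth_rmse(pred, target, valid_mask=None):
    """Root-mean-square error between two same-shaped depth maps.

    pred, target: 2D sequences (rows of floats), depth in the same unit.
    valid_mask:   optional 2D sequence of booleans (hypothetical argument);
                  pixels marked False are skipped.
    """
    squared_error_sum = 0.0
    valid_count = 0
    for r, (pred_row, target_row) in enumerate(zip(pred, target)):
        for c, (p, t) in enumerate(zip(pred_row, target_row)):
            if valid_mask is not None and not valid_mask[r][c]:
                continue  # skip pixels without usable ground truth
            squared_error_sum += (p - t) ** 2
            valid_count += 1
    if valid_count == 0:
        raise ValueError("no valid pixels to evaluate")
    return math.sqrt(squared_error_sum / valid_count)


if __name__ == "__main__":
    # Toy 2x2 depth maps, not data from the paper.
    predicted = [[1.0, 2.0], [3.0, 4.0]]
    ground_truth = [[1.0, 2.0], [3.0, 6.0]]
    print(depth_rmse(predicted, ground_truth))
```

Because the error is squared before averaging, a few badly estimated pixels raise the score more than many small deviations do, which is why a lower value is read as a better depth estimate.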
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10892898 | PMC |
| http://dx.doi.org/10.3390/s24041312 | DOI Listing |
World J Clin Cases
December 2024
Department of Physical Medicine and Rehabilitation, Dongnam Institute of Radiological and Medical Sciences, Busan 46033, South Korea.
EMBO Rep
July 2024
Max Perutz Labs, Vienna Biocenter Campus (VBC), Vienna, Austria.
Junctions between the endoplasmic reticulum (ER) and the outer membrane of the nuclear envelope (NE) physically connect both organelles. These ER-NE junctions are essential for supplying the NE with lipids and proteins synthesized in the ER. However, little is known about the structure of these ER-NE junctions.
Cell Calcium
July 2024
Department of Physiology and Biophysics, School of Medicine, Universidad Autónoma de San Luis Potosí. Ave. V. Carranza 2905, Los Filtros, San Luis Potosí, SLP 78210, Mexico.
The TMEM16A channel, a member of the TMEM16 protein family comprising chloride (Cl⁻) channels and lipid scramblases, is activated by the free intracellular Ca²⁺ increments produced by inositol 1,4,5-trisphosphate (IP3)-induced Ca²⁺ release after GqPCRs or Ca²⁺ entry through cationic channels. It is a ubiquitous transmembrane protein that participates in multiple physiological functions essential to mammals' lives. TMEM16A structure contains two identical 10-segment monomers joined at their transmembrane segment 10.
J Cell Biol
May 2024
Department of Cell and Developmental Biology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA.
The septin cytoskeleton is extensively regulated by posttranslational modifications, such as phosphorylation, to achieve the diversity of architectures including rings, hourglasses, and gauzes. While many of the phosphorylation events of septins have been extensively studied in the budding yeast Saccharomyces cerevisiae, the regulation of the kinases involved remains poorly understood. Here, we show that two septin-associated kinases, the LKB1/PAR-4-related kinase Elm1 and the Nim1/PAR-1-related kinase Gin4, regulate each other at two discrete points of the cell cycle.
Sensors (Basel)
February 2024
Department of Electronic Engineering, Hanbat National University, 125, Dongseo-daero, Yuseong-gu, Daejeon 34158, Republic of Korea.
In this paper, we propose a novel method for monocular depth estimation using an hourglass neck module. The proposed method makes the following contributions. First, feature maps are extracted from Swin Transformer V2 using a masked image modeling (MIM) pretrained model.