Swin Transformer-Based Edge Guidance Network for RGB-D Salient Object Detection.

Sensors (Basel)

Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China.

Published: October 2023


Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

Salient object detection (SOD), which is used to identify the most distinctive object in a given scene, plays an important role in computer vision tasks. Most existing RGB-D SOD methods employ a CNN-based network as the backbone to extract features from RGB and depth images; however, the inherent locality of a CNN-based network limits the performance of CNN-based methods. To tackle this issue, we propose a novel Swin Transformer-based edge guidance network (SwinEGNet) for RGB-D SOD, in which the Swin Transformer is employed as a powerful feature extractor to capture the global context, and an edge-guided cross-modal interaction module is proposed to effectively enhance and fuse features. In particular, we employed the Swin Transformer as the backbone to extract features from RGB images and depth maps. Then, we introduced the edge extraction module (EEM) to extract edge features and the depth enhancement module (DEM) to enhance depth features. Additionally, a cross-modal interaction module (CIM) was used to integrate cross-modal features from global and local contexts. Finally, we employed a cascaded decoder to refine the prediction map in a coarse-to-fine manner. Extensive experiments demonstrated that our SwinEGNet achieved the best performance on the LFSD, NLPR, DES, and NJU2K datasets and comparable performance on the STEREO dataset against 14 state-of-the-art methods. Our model also achieved better performance than SwinNet while using only 88.4% of its parameters and 77.2% of its FLOPs. Our code will be publicly available.
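
To make the data flow described in the abstract concrete, the following is a minimal PyTorch sketch: two backbones extract RGB and depth feature pyramids, an edge extraction module (EEM) predicts an edge map from the shallowest RGB feature, a depth enhancement module (DEM) re-weights depth features, a cross-modal interaction module (CIM) fuses the two modalities under edge guidance, and a cascaded decoder refines the saliency map coarse-to-fine. The module internals, channel sizes, and the small convolutional stand-in for the Swin Transformer backbone are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the Swin Transformer backbone so the sketch runs without
# extra dependencies; three strided conv stages emulate a multi-scale
# feature pyramid (the paper uses Swin features here).
class StandInBackbone(nn.Module):
    def __init__(self, in_ch, dims=(32, 64, 128)):
        super().__init__()
        self.stages = nn.ModuleList()
        ch = in_ch
        for d in dims:
            self.stages.append(nn.Sequential(
                nn.Conv2d(ch, d, 3, stride=2, padding=1),
                nn.ReLU(inplace=True)))
            ch = d

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # fine-to-coarse feature pyramid

# EEM (assumed form): predict an edge map from the shallowest RGB feature.
class EdgeExtractionModule(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.head = nn.Conv2d(ch, 1, 1)

    def forward(self, f_rgb_low):
        return torch.sigmoid(self.head(f_rgb_low))

# DEM (assumed form): channel attention that re-weights depth features.
class DepthEnhancementModule(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch), nn.Sigmoid())

    def forward(self, f_depth):
        w = self.fc(f_depth.mean(dim=(2, 3)))        # (B, C) channel weights
        return f_depth * w[:, :, None, None]

# CIM (assumed form): concatenate RGB and depth features, then emphasize
# edge regions using the resized edge map.
class CrossModalInteraction(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, f_rgb, f_depth, edge):
        edge = F.interpolate(edge, size=f_rgb.shape[-2:],
                             mode="bilinear", align_corners=False)
        fused = self.fuse(torch.cat([f_rgb, f_depth], dim=1))
        return fused * (1.0 + edge)

class SwinEGNetSketch(nn.Module):
    def __init__(self, dims=(32, 64, 128)):
        super().__init__()
        self.rgb_backbone = StandInBackbone(3, dims)
        self.depth_backbone = StandInBackbone(1, dims)
        self.eem = EdgeExtractionModule(dims[0])
        self.dems = nn.ModuleList(DepthEnhancementModule(d) for d in dims)
        self.cims = nn.ModuleList(CrossModalInteraction(d) for d in dims)
        self.heads = nn.ModuleList(nn.Conv2d(d, 1, 1) for d in dims)

    def forward(self, rgb, depth):
        f_rgb = self.rgb_backbone(rgb)
        f_dep = self.depth_backbone(depth)
        edge = self.eem(f_rgb[0])
        pred = None
        # Cascaded decoder: start at the coarsest level and refine the
        # saliency map level by level (coarse-to-fine).
        for fr, fd, dem, cim, head in zip(
                reversed(f_rgb), reversed(f_dep),
                reversed(list(self.dems)), reversed(list(self.cims)),
                reversed(list(self.heads))):
            fused = cim(fr, dem(fd), edge)
            p = head(fused)
            if pred is not None:
                p = p + F.interpolate(pred, size=p.shape[-2:],
                                      mode="bilinear", align_corners=False)
            pred = p
        sal = F.interpolate(pred, size=rgb.shape[-2:],
                            mode="bilinear", align_corners=False)
        return torch.sigmoid(sal), edge

# Example: a 224x224 RGB image paired with a single-channel depth map.
net = SwinEGNetSketch()
sal, edge = net(torch.randn(1, 3, 224, 224), torch.randn(1, 1, 224, 224))
print(sal.shape, edge.shape)   # (1, 1, 224, 224) and (1, 1, 112, 112)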

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10650861 (PMC)
http://dx.doi.org/10.3390/s23218802 (DOI Listing)

Publication Analysis

Top Keywords

swin transformer-based (8); transformer-based edge (8); edge guidance (8); guidance network (8); salient object (8); object detection (8); rgb-d sod (8); cnn-based network (8); backbone extract (8); extract features (8)

Similar Publications

Objective: This study aims to develop a robust, multi-task deep learning framework that integrates vessel segmentation and radiomic analysis for the automated classification of four retinal conditions, namely diabetic retinopathy (DR), hypertensive retinopathy (HR), papilledema, and normal fundus, using fundus images.

Materials and Methods: A total of 2,165 patients from eight medical centers were enrolled.

The precise detection and localization of abnormalities in radiological images are crucial for clinical diagnosis and treatment planning. To build reliable models, large, annotated datasets are required that contain disease labels and abnormality locations. Radiologists often face challenges in identifying and segmenting thoracic diseases such as COVID-19, pneumonia, tuberculosis, and lung cancer due to overlapping visual patterns in X-ray images.

Knee ailments, such as meniscus injuries, affect millions globally, with research showing that more than 14% of the population above 40 years of age lives with meniscus-related conditions. Conventional diagnosis techniques, like manual MRI interpretation, are labour-intensive, error-prone, and dependent on skilled radiologists, making an automatic and more accurate alternative indispensable. Current deep-learning solutions rely heavily on CNNs, which struggle with long-range dependencies and global contextual information.

Quantum integration in swin transformer mitigates overfitting in breast cancer screening.

Sci Rep

August 2025

Origin Quantum Computing Technology (Hefei) Co., Ltd., Hefei, 230088, China.

To explore the potential of quantum computing in advancing transformer-based deep learning models for breast cancer screening, this study introduces the Quantum-Enhanced Swin Transformer (QEST). This model integrates a Variational Quantum Circuit (VQC) to replace the fully connected layer responsible for classification in the Swin Transformer architecture. In simulations, QEST exhibited competitive accuracy and generalization performance compared to the original Swin Transformer, while also demonstrating an effect in mitigating overfitting.
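
As a rough illustration of the idea of swapping the classification head for a variational quantum circuit, the following PennyLane/PyTorch sketch compresses a pooled Swin feature vector to a few qubits, runs a small VQC, and maps the measured expectation values to two screening classes. The 768-dimensional input width (a Swin-T-sized pooled feature), the 4-qubit circuit, and the embedding/entangling templates are illustrative assumptions, not the QEST architecture.

import pennylane as qml
import torch
import torch.nn as nn

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def vqc(inputs, weights):
    # Encode the compressed features as rotation angles, entangle, measure.
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

weight_shapes = {"weights": (2, n_qubits)}  # 2 variational layers (assumed)

# Drop-in replacement for the transformer's fully connected classifier:
# 768-dim pooled feature -> 4 qubits -> VQC -> 2 logits (benign / malignant).
quantum_head = nn.Sequential(
    nn.Linear(768, n_qubits),
    qml.qnn.TorchLayer(vqc, weight_shapes),
    nn.Linear(n_qubits, 2),
)

logits = quantum_head(torch.randn(8, 768))   # batch of 8 pooled features
print(logits.shape)                          # torch.Size([8, 2])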

Arterial Spin Labeling (ASL) perfusion MRI is the only non-invasive technique for quantifying regional cerebral blood flow (CBF), an important physiological variable. ASL MRI has a relatively low signal-to-noise ratio (SNR), making it challenging to achieve high-quality CBF images from limited data. Promising ASL CBF denoising results have been shown in recent convolutional neural network (CNN)-based methods.
