Salient object detection (SOD), which identifies the most distinctive object in a given scene, plays an important role in computer vision tasks. Most existing RGB-D SOD methods employ a CNN-based network as the backbone to extract features from RGB and depth images; however, the inherent locality of CNNs limits the performance of these methods. To tackle this issue, we propose a novel Swin Transformer-based edge guidance network (SwinEGNet) for RGB-D SOD, in which the Swin Transformer is employed as a powerful feature extractor to capture the global context and an edge-guided cross-modal interaction module is used to effectively enhance and fuse features. In particular, we employ the Swin Transformer as the backbone to extract features from RGB images and depth maps. We then introduce the edge extraction module (EEM) to extract edge features and the depth enhancement module (DEM) to enhance depth features. Additionally, a cross-modal interaction module (CIM) integrates cross-modal features from global and local contexts. Finally, we employ a cascaded decoder to refine the prediction map in a coarse-to-fine manner. Extensive experiments demonstrate that SwinEGNet achieves the best performance on the LFSD, NLPR, DES, and NJU2K datasets and comparable performance on the STEREO dataset when compared with 14 state-of-the-art methods. Our model also outperforms SwinNet while using only 88.4% of its parameters and 77.2% of its FLOPs. Our code will be publicly available.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10650861 | PMC |
| http://dx.doi.org/10.3390/s23218802 | DOI Listing |
Photodiagnosis Photodyn Ther
September 2025
Department of Ophthalmology, People's Hospital of Feng Jie, Chongqing, 404600, China.
Objective: This study aims to develop a robust, multi-task deep learning framework that integrates vessel segmentation and radiomic analysis for the automated classification of four retinal conditions from fundus images: diabetic retinopathy (DR), hypertensive retinopathy (HR), papilledema, and normal fundus.
Materials and Methods: A total of 2,165 patients from eight medical centers were enrolled.
Sci Rep
August 2025
Centre for Autonomous Robotic Systems, Khalifa University, Abu Dhabi, United Arab Emirates.
The precise detection and localization of abnormalities in radiological images are crucial for clinical diagnosis and treatment planning. Building reliable models requires large annotated datasets that contain disease labels and abnormality locations. Radiologists often face challenges in identifying and segmenting thoracic diseases such as COVID-19, pneumonia, tuberculosis, and lung cancer because of overlapping visual patterns in X-ray images.
PLoS One
August 2025
Department of Electronics and Communication Engineering, Kuwait College of Science and Technology (KCST), Doha Area, Kuwait.
Knee ailments, such as meniscus injuries, affect millions of people globally, with research showing that more than 14% of the population above 40 years of age lives with meniscus-related conditions. Conventional diagnostic techniques, such as manual MRI interpretation, are labour-intensive, error-prone, and dependent on skilled radiologists, making an automatic and more accurate alternative indispensable. Current deep-learning solutions depend heavily on CNNs, which struggle to capture long-range dependencies and global contextual information.
Sci Rep
August 2025
Origin Quantum Computing Technology (Hefei) Co., Ltd., Hefei, 230088, China.
To explore the potential of quantum computing in advancing transformer-based deep learning models for breast cancer screening, this study introduces the Quantum-Enhanced Swin Transformer (QEST). This model uses a Variational Quantum Circuit (VQC) in place of the fully connected layer responsible for classification in the Swin Transformer architecture. In simulations, QEST exhibited competitive accuracy and generalization performance compared to the original Swin Transformer, while also helping to mitigate overfitting.
Vis Comput
July 2025
Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine.
Arterial Spin Labeling (ASL) perfusion MRI is the only non-invasive technique for quantifying and visualizing regional cerebral blood flow (CBF), which is an important physiological variable. ASL MRI has a relatively low signal-to-noise ratio (SNR), making it challenging to obtain high-quality CBF images from limited data. Recent convolutional neural network (CNN)-based methods have shown promising ASL CBF denoising results.