RGB-D saliency detection aims to accurately localize salient regions using the complementary information of a depth map. Global contexts carried by the deep layers are key to salient object detection, but they are diluted when transferred to shallower layers. In addition, depth maps may contain misleading information due to the limitations of depth sensors. To tackle these issues, in this paper we propose a new cross-modal cross-scale network for RGB-D salient object detection, in which global context information provides global guidance to boost performance in complex scenarios. First, we introduce a global-guided cross-modal and cross-scale module, named GCMCSM, to realize global-guided cross-modal cross-scale fusion. Then, we employ feature refinement modules for progressive refinement in a coarse-to-fine manner. In addition, we adopt a hybrid loss function to supervise the training of GCMCSNet over different scales. With all these modules working together, GCMCSNet effectively enhances both salient object details and salient object localization. Extensive experiments on challenging benchmark datasets demonstrate that our GCMCSNet outperforms existing state-of-the-art methods.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10459329 | PMC |
| http://dx.doi.org/10.3390/s23167221 | DOI Listing |
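The abstract above does not spell out the components of the hybrid loss, but multi-scale supervision with a combined BCE + IoU objective is a common choice in salient object detection. A minimal sketch under that assumption, with illustrative PyTorch-style side outputs (none of these names come from the paper):

```python
import torch
import torch.nn.functional as F

def iou_loss(logits, target, eps=1e-6):
    # Soft IoU loss computed on sigmoid probabilities.
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(2, 3))
    union = (prob + target - prob * target).sum(dim=(2, 3))
    return (1.0 - (inter + eps) / (union + eps)).mean()

def hybrid_multiscale_loss(side_outputs, gt):
    # side_outputs: list of logit maps, each (B, 1, Hi, Wi)
    # gt:           ground-truth saliency map, (B, 1, H, W), values in [0, 1]
    total = 0.0
    for logits in side_outputs:
        # Match the ground truth to each prediction's scale.
        gt_s = F.interpolate(gt, size=logits.shape[2:], mode="bilinear",
                             align_corners=False)
        total = total + F.binary_cross_entropy_with_logits(logits, gt_s) \
                      + iou_loss(logits, gt_s)
    return total
```

Summing the per-scale terms lets supervision from the ground truth reach every decoder stage directly, which is the usual motivation for supervising "over different scales".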
IEEE Trans Image Process
May 2025
Exploring complementary information between RGB and thermal/depth modalities is crucial for bi-modal salient object detection (BSOD). However, the distinct characteristics of different modalities often lead to large differences in information distributions. Existing models, which rely on convolutional operations or plug-and-play attention mechanisms, struggle to address this issue.
Entropy (Basel)
April 2025
The College of Electronic and Information Engineering, Changchun University of Science and Technology, Changchun 130022, China.
RGB-thermal object detection harnesses complementary information from visible and thermal modalities to enhance detection robustness in challenging environments, particularly under low-light conditions. However, existing approaches suffer from limitations due to their heavy dependence on precisely registered data and insufficient handling of cross-modal distribution disparities. This paper presents RDCRNet, a novel framework incorporating a Cross-Modal Representation Model to effectively address these challenges.
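The snippet does not detail RDCRNet's Cross-Modal Representation Model, but the general idea of letting each modality re-weight the other's features before fusion can be sketched as follows. This is a generic channel-gating design, assumed for illustration only, not the paper's actual module:

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Each modality gates the other's channels before the features are merged."""
    def __init__(self, channels):
        super().__init__()
        # Channel-attention gates, one derived from each modality.
        self.gate_rgb = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.gate_thermal = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.merge = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, f_rgb, f_thermal):
        rgb_att = f_rgb * self.gate_thermal(f_thermal)   # thermal context re-weights RGB
        thermal_att = f_thermal * self.gate_rgb(f_rgb)   # RGB context re-weights thermal
        return self.merge(torch.cat([rgb_att, thermal_att], dim=1))
```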
Comput Biol Med
October 2023
School of Medical Information and Engineering, Ningxia Medical University, Yinchuan, 750004, China.
Background: Multimodal medical image detection is a key technology in medical image analysis, which plays an important role in tumor diagnosis. Lesions in multimodal lung tumor images vary in size and shape, which makes it difficult to effectively extract the key features of lung tumor lesions.
Methods: A Cross-modal Cross-scale Global-Local Attention YOLOV5 Lung Tumor Detection Model (CCGL-YOLOV5) is proposed in this paper.
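The snippet names a global-local attention design without describing it. One common way to pair a global channel branch with a local spatial branch (in the spirit of CBAM-style attention; assumed here for illustration, not necessarily the paper's block) is:

```python
import torch.nn as nn

class GlobalLocalAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Global branch: channel attention from globally pooled context.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid())
        # Local branch: spatial attention from a small neighborhood.
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel(x)     # re-weight channels using global context
        return x * self.spatial(x)  # re-weight positions using local context
```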
Brief Bioinform
March 2022
School of Information Science and Technology, Northeast Normal University, Jingyue Street, 130117, Changchun, China.
Accurate identification of drug-target interactions (DTIs) plays a crucial role in drug discovery. Compared with traditional experimental methods, which are labor-intensive and time-consuming, computational methods have become increasingly popular in recent years. However, conventional computational methods mostly rely on heterogeneous networks that integrate diverse drug-related and target-related datasets, rather than fully exploring drug and target similarities.
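The snippet emphasizes drug similarities without defining them; one standard, concrete instance is Tanimoto similarity on Morgan fingerprints, shown here with RDKit purely as an illustration of the concept, not as the paper's actual measure:

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def drug_similarity(smiles_a, smiles_b):
    # Tanimoto similarity between two drugs given as SMILES strings.
    fps = []
    for smi in (smiles_a, smiles_b):
        mol = Chem.MolFromSmiles(smi)
        fps.append(AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048))
    return DataStructs.TanimotoSimilarity(fps[0], fps[1])

# Example: aspirin vs. ibuprofen
print(drug_similarity("CC(=O)OC1=CC=CC=C1C(=O)O",
                      "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O"))
```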