Dual aggregation based joint-modal similarity hashing for cross-modal retrieval.

Neural Netw

Shanghai Maritime University, Shanghai, 201306, China.

Published: September 2025



Article Abstract

Cross-modal hashing aims to leverage hashing functions to map multimodal data into a unified low-dimensional space, enabling efficient cross-modal retrieval. In particular, unsupervised cross-modal hashing methods have attracted significant attention because they do not require external label information. However, unsupervised cross-modal hashing still faces several pressing issues: (1) how to facilitate semantic alignment between modalities, and (2) how to effectively capture the intrinsic relationships within the data, thereby constructing a more reliable affinity matrix to assist hash code learning. In this paper, Dual Aggregation-Based Joint-modal Similarity Hashing (DAJSH) is proposed to overcome these challenges. To enhance cross-modal semantic alignment, we employ a Transformer encoder to fuse image and text features and introduce a contrastive loss to optimize cross-modal consistency. Additionally, to construct a more reliable affinity matrix for hash code learning, we propose a dual-aggregation affinity matrix construction scheme. This scheme integrates intra-modal cosine similarity and Euclidean distance while incorporating cross-modal similarity, thereby maximally preserving cross-modal semantic information. Experimental results demonstrate that our method achieves performance improvements of 1.9%-5.1%, 0.9%-5.8% and 0.6%-2.6% over state-of-the-art approaches on the MIR Flickr, NUS-WIDE and MS COCO benchmark datasets, respectively.
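The dual-aggregation idea described in the abstract can be illustrated with a minimal sketch. The code below is a hypothetical reconstruction, not the authors' implementation: it assumes image and text features are row-aligned matrices, combines intra-modal cosine similarity with a normalized Euclidean-distance similarity, and then mixes the two modalities with a weighting parameter. The function name and the weights `alpha` and `beta` are illustrative assumptions.

```python
import numpy as np

def dual_aggregation_affinity(img_feats, txt_feats, alpha=0.5, beta=0.5):
    """Sketch of a dual-aggregation affinity matrix (illustrative only).

    img_feats, txt_feats: (n, d) feature matrices with aligned rows.
    alpha: weight between cosine similarity and distance-based similarity.
    beta:  weight between the image and text affinity matrices.
    """
    def intra_modal(F):
        # Cosine similarity between all pairs of rows.
        Fn = F / np.linalg.norm(F, axis=1, keepdims=True)
        cos_sim = Fn @ Fn.T
        # Pairwise Euclidean distances, mapped to a similarity in [0, 1].
        dist = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
        dist_sim = 1.0 - dist / dist.max()
        # Aggregate the two intra-modal views.
        return alpha * cos_sim + (1 - alpha) * dist_sim

    S_img = intra_modal(img_feats)
    S_txt = intra_modal(txt_feats)
    # Cross-modal aggregation into a joint affinity matrix.
    return beta * S_img + (1 - beta) * S_txt
```

In a hashing pipeline, a joint affinity matrix like this would typically supervise the inner products of the learned hash codes; the exact loss and normalization in DAJSH may differ from this sketch.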


Source
http://dx.doi.org/10.1016/j.neunet.2025.108069


Similar Publications


Due to its low storage requirements and high retrieval efficiency, hashing-based retrieval has shown great potential and has been widely applied in information retrieval. However, retrieval tasks in real-world applications usually need to handle data from various domains, leading to unsatisfactory performance from existing hashing-based methods, as most of them assume that the retrieval pool and the query set are similar. Moreover, most existing works overlook the self-representation in cross-modal data, which contains modality-specific semantic information.


Deformable image registration (DIR) is critical for accurate clinical diagnosis and effective treatment planning. However, patient movement, significant intensity differences, and large breathing deformations hinder accurate anatomical alignment in multi-modal image registration. These factors exacerbate the entanglement of anatomical and modality-specific style information, thereby severely limiting the performance of multi-modal registration.


Cross-Modal Hashing (CMH) has become a powerful technique for large-scale cross-modal retrieval, offering benefits like fast computation and efficient storage. However, most CMH models struggle to adapt to streaming multimodal data in real-time once deployed. Although recent online CMH studies have made progress in this area, they often overlook two key challenges: 1) learning effectively from streaming partial-modal multimodal data, and 2) avoiding the high costs associated with frequent hash function re-training and large-scale updates to database hash codes.


[Cross modal medical image online hash retrieval based on online semantic similarity].

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi

April 2025

School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China.

Online hashing methods are receiving increasing attention in cross-modal medical image retrieval research. However, existing online methods often lack the ability to maintain semantic correlation between new and existing data during learning. To this end, we proposed an online semantic similarity cross-modal hashing (OSCMH) learning framework to incrementally learn compact binary hash codes for streaming medical data.
