Social images are often associated with rich but noisy tags from community contributions. Although social tags can potentially provide valuable semantic training information for image retrieval, existing studies fail to filter noise effectively by exploiting the cross-modal correlation between image content and tags. Current cross-modal vision-and-language representation learning methods, which selectively attend to the relevant parts of the image and text, show a promising direction. However, they are not suitable for social image retrieval because: (1) they deal with natural text sequences, where the relationships between words can be easily captured by language models for cross-modal relevance estimation, whereas tags are isolated and noisy; (2) they take an (image, text) pair as input, and consequently cannot be employed directly for unimodal social image retrieval. This paper tackles the challenge of utilizing cross-modal interactions to learn precise representations for unimodal retrieval. The proposed framework, dubbed CGVR (Cross-modal Guided Visual Representation), extracts accurate semantic representations of images from noisy tags and transfers this ability to an image-only hashing subnetwork through a carefully designed training scheme. To capture correlated semantics and filter noise, it embeds a priori common-sense relationships among tags into the attention computation for joint awareness of textual and visual context. Experiments show that CGVR achieves improvements of approximately 8.82 and 5.45 points in MAP over the state of the art on two widely used social image benchmarks. CGVR can serve as a new baseline for the image retrieval community. The code is provided at https://github.com/zhaowanqing/CGVR.
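For readers unfamiliar with the MAP metric cited in the abstract, the following is a minimal sketch of how mean average precision is commonly computed for hashing-based retrieval. It is not taken from the CGVR code. It assumes that hash codes are stored as integers, that database items are ranked by Hamming distance to the query, and that an item counts as relevant when it shares at least one label with the query. The function names and toy data are hypothetical.

```python
from typing import List, Set, Tuple

Item = Tuple[int, Set[str]]  # (hash code as an integer, set of ground-truth labels)


def hamming(a: int, b: int) -> int:
    """Hamming distance between two hash codes stored as integers."""
    return bin(a ^ b).count("1")


def average_precision(query: Item, database: List[Item]) -> float:
    """Average precision of one query over a Hamming-ranked database.

    Assumption: an item is relevant if it shares at least one label with the query.
    """
    q_code, q_labels = query
    # Stable sort: ties in distance keep their original database order.
    ranked = sorted(database, key=lambda item: hamming(q_code, item[0]))
    hits = 0
    precision_sum = 0.0
    for rank, (_, labels) in enumerate(ranked, start=1):
        if q_labels & labels:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant position
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries: List[Item], database: List[Item]) -> float:
    """Mean of the per-query average precision values."""
    if not queries:
        return 0.0
    return sum(average_precision(q, database) for q in queries) / len(queries)


# Toy 4-bit codes, purely illustrative.
database = [(0b0000, {"cat"}), (0b0001, {"dog"}), (0b1111, {"cat"})]
queries = [(0b0000, {"cat"}), (0b1111, {"dog"})]
print(mean_average_precision(queries, database))
```

Published results often restrict the ranking to the top-k retrieved items; this sketch ranks the whole database for simplicity.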
DOI: http://dx.doi.org/10.1109/TPAMI.2024.3519112
J Eval Clin Pract
September 2025
Department of Orthopedics and Traumatology, Medical Faculty, University of Health Sciences, Antalya, Turkey.
Aims And Objective: The field of medical statistics has experienced significant advancements driven by integrating innovative statistical methodologies. This study aims to conduct a comprehensive analysis to explore current trends, influential research areas, and future directions in medical statistics.
Methods: This paper maps the evolution of statistical methods used in medical research based on 4,919 relevant publications retrieved from the Web of Science.
J Feline Med Surg
September 2025
Department for Small Animals, Veterinary Faculty, Leipzig University, Leipzig, Germany.
Objectives: The objective of this study was to evaluate the occurrence of voltage-gated potassium channel (VGKC) antibodies and the pattern of MRI changes in cats with complex partial seizures with orofacial involvement (CPSOFI), as well as to investigate whether there are factors influencing survival that could be used as prognostic markers in those cats.
Methods: Cats with CPSOFI were identified retrospectively. The following data were retrieved from the hospital database: signalment, age at first seizure and presentation, the presence of antibodies against VGKC (leucine-rich glioma inactivating factor 1 (LGI1), contactin-associated protein 2 (CASPR2)) and cerebrospinal fluid (CSF) analysis findings.
JMIR Form Res
September 2025
Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, University of Ulsan College of Medicine, 88 Olympic-ro 43-gil, Asan Medical Center, Seoul, 05505, Republic of Korea.
Background: Opportunistic computed tomography (CT) screening for the evaluation of sarcopenia and myosteatosis has been gaining emphasis. A fully automated artificial intelligence (AI)-integrated system for body composition assessment on CT scans is a prerequisite for effective opportunistic screening. However, no study has evaluated the implementation of fully automated AI systems for opportunistic screening in real-world clinical practice for routine health check-ups.
Front Oncol
August 2025
Department of Hepatobiliary Surgery, The Second Hospital of Hebei Medical University, Shijiazhuang, Hebei, China.
Background: Early diagnosis can significantly improve the survival rate of pancreatic ductal adenocarcinoma (PDAC), but because early symptoms are insidious and non-specific, most patients are no longer suitable for surgery by the time of diagnosis. Traditional imaging techniques and an increasing number of non-imaging diagnostic methods have been used for the early diagnosis of pancreatic cancer (PC) through deep learning (DL).
Objective: This review summarizes deep learning-based diagnostic methods for pancreatic cancer and outlines future directions for deep learning in the early diagnosis of pancreatic cancer.
Multivariate pattern analysis (MVPA) methods are a versatile tool to retrieve information from neurophysiological data obtained with functional magnetic resonance imaging (fMRI) techniques. Since fMRI is based on measuring the hemodynamic response following neural activation, the spatial specificity of the fMRI signal is inherently limited by contributions of macrovascular compartments that drain the signal from the actual location of neural activation, making it challenging to image cortical structures at the spatial scale of cortical columns and layers. By relying on information from multiple voxels, MVPA has shown promising results in retrieving information encoded in fine-grained spatial patterns.