Hyperbolic vision language representation learning on chest radiology images.

Health Inf Sci Syst

Department of Anesthesiology, Shanghai Chest Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200030 China.

Published: December 2025


Article Abstract

Given the visual-semantic hierarchy between images and texts, hyperbolic embeddings have been employed for visual-semantic representation learning, leveraging the advantages of hierarchy modeling in hyperbolic space. This approach demonstrates notable advantages in zero-shot learning tasks. However, unlike general image-text alignment tasks, textual data in the medical domain often comprises complex sentences describing various conditions or diseases, posing challenges for vision language models to comprehend free-text medical reports. Consequently, we propose a novel pretraining method specifically for medical image-text data in hyperbolic space. This method uses structured radiology reports, which consist of a set of triplets, and then converts these triplets into sentences through prompt engineering. To address the challenge that diseases or symptoms generally occur in local regions, we introduce a global + local image feature extraction module. By leveraging the hierarchy modeling advantages of hyperbolic space, we employ entailment loss to model the partial order relationship between images and texts. Experimental results show that our method exhibits better generalization and superior performance compared to baseline methods in various zero-shot tasks and different datasets.
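The abstract does not give the exact form of the entailment loss, so the following is only a minimal sketch under stated assumptions. It uses an entailment-cone formulation commonly applied in the Lorentz (hyperboloid) model: a text embedding defines a cone, and an image embedding is penalized by how far it falls outside that cone. The curvature parameter `c`, the aperture constant `k = 0.1`, and all function names are illustrative assumptions, not the authors' implementation.

```python
import math

def exp_map_origin(v, c=1.0):
    """Lift a Euclidean vector v (tangent at the origin) onto the hyperboloid
    of curvature -c. Returns (space_components, time_component)."""
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        return list(v), 1.0 / math.sqrt(c)
    scale = math.sinh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    space = [scale * a for a in v]
    time = math.sqrt(1.0 / c + sum(a * a for a in space))
    return space, time

def lorentz_inner(x, y):
    """Lorentzian inner product: <x_space, y_space> - x_time * y_time."""
    (xs, xt), (ys, yt) = x, y
    return sum(a * b for a, b in zip(xs, ys)) - xt * yt

def half_aperture(x, c=1.0, k=0.1):
    """Half-aperture of the entailment cone at x; narrower far from the origin.
    k is an assumed boundary constant."""
    xs, _ = x
    norm = math.sqrt(sum(a * a for a in xs))
    ratio = min(1.0, 2.0 * k / (math.sqrt(c) * norm + 1e-8))
    return math.asin(ratio)

def exterior_angle(x, y, c=1.0):
    """Angle at x between the geodesic ray from the origin through x
    and the geodesic from x to y."""
    xs, xt = x
    _, yt = y
    cxy = c * lorentz_inner(x, y)
    norm = math.sqrt(sum(a * a for a in xs))
    denom = norm * math.sqrt(max(cxy * cxy - 1.0, 1e-8)) + 1e-8
    cos_angle = (yt + xt * cxy) / denom
    return math.acos(max(-1.0, min(1.0, cos_angle)))

def entailment_loss(text, image, c=1.0):
    """Zero when the image lies inside the text's cone, positive otherwise."""
    return max(0.0, exterior_angle(text, image, c) - half_aperture(text, c))

# Toy 2-D example: the text sits near the origin (more generic concept).
text = exp_map_origin([0.5, 0.0])
image_inside = exp_map_origin([2.0, 0.0])    # same direction, farther out
image_outside = exp_map_origin([-2.0, 0.0])  # opposite direction

loss_inside = entailment_loss(text, image_inside)
loss_outside = entailment_loss(text, image_outside)
```

In this toy setup the image that lies farther along the same direction as the text incurs no penalty, while the image on the opposite side of the origin does. This mirrors the partial order the abstract describes, with texts as the more general concept and images as the more specific one.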

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11891115
DOI: http://dx.doi.org/10.1007/s13755-025-00341-x

Similar Publications

HC-SPA: Hyperbolic Cosine-Based Symplectic Phase Alignment for Fusion Optimization.

Sensors (Basel)

August 2025

College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China.

In multimodal collaborative learning, the gradient dynamics of heterogeneous modalities face significant challenges due to the curvature heterogeneity of parameter manifolds and mismatches in phase evolution. Traditional Euclidean optimization methods struggle to capture the complex interdependencies between heterogeneous modalities on non-Euclidean or geometrically inconsistent parameter manifolds. Furthermore, static alignment strategies often fail to suppress bifurcations and oscillatory behaviors in high-dimensional gradient flows, leading to unstable optimization trajectories across modalities.

In recommender systems research, the data sparsity problem has driven the development of hybrid recommendation algorithms that integrate multimodal information and apply graph neural networks (GNNs). However, conventional GNNs relying on homogeneous Euclidean embeddings fail to effectively model the non-Euclidean geometric manifold structures prevalent in real-world scenarios, which constrains their representation capacity for heterogeneous interaction patterns and compromises recommendation accuracy.

This study addresses the problem of generalized category discovery (GCD), an advanced and challenging semi-supervised learning scenario that deals with unlabeled data from both known and novel categories. Although recent research has effectively engaged with this issue, these studies typically map features into Euclidean space, which fails to maintain the latent semantic hierarchy of the training samples effectively. This limitation restricts the exploration of more detailed and rich information and degrades the performance in discovering new categories.

Anomalous topological pumping in hyperbolic lattices.

Sci Bull (Beijing)

August 2025

Key Laboratory of Advanced Optoelectronic Quantum Architecture and Measurements of Ministry of Education, Beijing Institute of Technology, Beijing 100081, China; Beijing Key Laboratory of Nanophotonics & Ultrafine Optoelectronic Systems, School of Physics, Beijing Institute of Technology, Beijing 10

Hyperbolic lattices, non-Euclidean regular tilings with constant negative curvature, provide a unique framework to explore curvature-driven topological phenomena inaccessible in flat geometries. While recent advances have focused on static hyperbolic systems, the dynamical interplay between curved space and time-modulated topology remains uncharted. Here, we study the topological pumping in hyperbolic lattices, discovering anomalous phenomena with no Euclidean analogs.

Optimal network geometry detection for weak geometry.

Phys Rev E

July 2025

University of Twente, Department of Electrical Engineering, Mathematics and Computer Science, Enschede, The Netherlands.

Network geometry, characterized by nodes with associated latent variables, is a fundamental feature of real-world networks. Still, when only the network edges are given, it may be difficult to assess whether the network contains an underlying source of geometry. This paper investigates the limits of geometry detection in networks in a wide class of models that contain geometry and power-law degrees, which include the popular hyperbolic random graph model.
