Multi-View Echocardiographic Embedding for Accessible AI Development.

Takeshi Tohyama, Ahram Han, Dukyong Yoon, Kenneth Paik, Brian Gow, Nura Izath, Jacques Kpodonu, Leo Anthony Celi

medRxiv

Published: August 2025

Background And Aims: Echocardiography serves as a cornerstone of cardiovascular diagnostics through multiple standardized imaging views. While recent AI foundation models demonstrate superior capabilities across cardiac imaging tasks, their massive computational requirements and reliance on large-scale datasets create accessibility barriers, limiting AI development to well-resourced institutions. Vector embedding approaches offer promising solutions by leveraging compact representations from original medical images for downstream applications. Furthermore, demographic fairness remains critical, as AI models may incorporate biases that confound clinically relevant features. We developed a multi-view encoder framework to address computational accessibility while investigating demographic fairness challenges.

Methods: We utilized the MIMIC-IV-ECHO dataset (7,169 echocardiographic studies) to develop a transformer-based multi-view encoder that aggregates view-level representations into study-level embeddings. The framework incorporated adversarial learning to suppress demographic information while maintaining clinical performance. We evaluated performance across 21 binary classification tasks encompassing echocardiographic measurements and clinical diagnoses, comparing against foundation model baselines with varying adversarial weights.
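The aggregation step described in the Methods can be illustrated with a minimal sketch. It assumes single-head attention pooling over precomputed view embeddings, which is far simpler than a full transformer encoder and is not the authors' implementation. The names `aggregate_views` and `encoder_objective`, the projection matrices, the learned query vector, and the form of the adversarial objective are all assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def aggregate_views(view_embeddings, view_mask, w_q, w_k, w_v, query):
    """Pool per-view embeddings into one study-level embedding.

    view_embeddings: (n_views, d) array, one row per echocardiographic view.
    view_mask: (n_views,) boolean array, False where a view is missing.
    w_q, w_k, w_v: (d, d) projection matrices (hypothetical learned weights).
    query: (d,) study-level query vector (hypothetical learned parameter).
    """
    if not view_mask.any():
        raise ValueError("at least one view must be present")
    d = view_embeddings.shape[1]
    q = query @ w_q                  # projected query, shape (d,)
    k = view_embeddings @ w_k        # keys, shape (n_views, d)
    v = view_embeddings @ w_v        # values, shape (n_views, d)
    scores = k @ q / np.sqrt(d)      # scaled dot-product scores, (n_views,)
    scores = np.where(view_mask, scores, -np.inf)  # missing views get weight 0
    weights = softmax(scores)
    return weights @ v, weights


def encoder_objective(task_loss, adversary_loss, adversarial_weight):
    """Assumed encoder objective: fit the clinical task while making the
    demographic adversary's loss large, scaled by the adversarial weight."""
    return task_loss - adversarial_weight * adversary_loss


# Synthetic example: four views, the third one missing.
rng = np.random.default_rng(0)
d = 512
views = rng.normal(size=(4, d))
mask = np.array([True, True, False, True])
w_q, w_k, w_v = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
query = rng.normal(size=d)

study_embedding, weights = aggregate_views(views, mask, w_q, w_k, w_v, query)
print(study_embedding.shape)  # (512,)
print(weights.round(3))
```

Masking missing views before the softmax lets the same pooling run on studies with fewer views. In this assumed objective, a larger adversarial weight trades clinical-task fit for demographic suppression.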

Results: The multi-view encoder achieved a mean improvement of 9.0 AUC points (12.0% relative improvement) across clinical tasks compared to foundation model embeddings. Performance remained robust with limited echocardiographic views compared to the conventional approach. However, adversarial learning showed limited effectiveness in reducing demographic shortcuts, with stronger weighting substantially compromising diagnostic performance.
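The Results give both an absolute gain (AUC points) and a relative gain (percent), but the abstract does not say how the two were averaged across tasks. As a rough illustration only, the sketch below computes per-task gains and averages them. The function name `summarize_improvement` and the AUC values are hypothetical and are not taken from the paper.

```python
def summarize_improvement(baseline_aucs, model_aucs):
    """Return (mean absolute gain in AUC points, mean relative gain in %).

    Both inputs are per-task AUCs on a 0-1 scale, in the same task order.
    """
    if not baseline_aucs or len(baseline_aucs) != len(model_aucs):
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(baseline_aucs, model_aucs))
    abs_gains = [100.0 * (m - b) for b, m in pairs]      # AUC points
    rel_gains = [100.0 * (m - b) / b for b, m in pairs]  # percent of baseline
    n = len(pairs)
    return sum(abs_gains) / n, sum(rel_gains) / n


# Hypothetical AUCs for three tasks (illustrative values, not study results).
baseline = [0.60, 0.75, 0.90]
model = [0.72, 0.81, 0.93]

abs_gain, rel_gain = summarize_improvement(baseline, model)
print(f"mean absolute gain: {abs_gain:.1f} AUC points")
print(f"mean relative gain: {rel_gain:.1f}%")
```

If the relative figure were instead the mean absolute gain divided by the mean baseline AUC, then 9.0 points and 12.0% together would imply a mean baseline AUC of about 0.75. The abstract does not state which convention was used.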

Conclusions: Our framework democratizes advanced cardiac AI capabilities, enabling substantial diagnostic improvements without massive computational infrastructure. While algorithmic approaches to demographic fairness showed limitations, the multi-view encoder provides a practical pathway for broader AI adoption in cardiovascular medicine with enhanced efficiency in real-world clinical settings.

Structured Graphical Abstract Or Graphical Abstract: Can multi-view encoder frameworks achieve superior diagnostic performance compared to foundation model embeddings while reducing computational requirements and maintaining robust performance with fewer echocardiographic views for cardiac AI applications?

Multi-view encoder achieved 12.0% relative improvement (9.0 AUC points) across 21 cardiac tasks compared to foundation model baselines, with efficient 512-dimensional vector embeddings and robust performance using fewer echocardiographic views.

Vector embedding approaches with attention-based multi-view integration significantly improve cardiac diagnostic performance while reducing computational requirements, offering a pathway toward more efficient AI implementation in clinical settings.

Our proposed multi-view encoder framework overcomes critical barriers to the widespread adoption of artificial intelligence in echocardiography. By dramatically reducing computational requirements, the multi-view encoder approach allows smaller healthcare institutions to develop sophisticated AI models locally. The framework maintains robust performance with fewer echocardiographic examinations, which addresses real-world clinical constraints where comprehensive imaging is not feasible due to patient factors or time limitations. This technology provides a practical way to democratize advanced cardiac AI capabilities, which could improve access to cardiovascular care across diverse healthcare settings while reducing dependence on proprietary datasets and massive computational resources.
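To make the point about local model development concrete, here is a minimal sketch of a downstream classifier trained on fixed 512-dimensional study embeddings. It assumes the embeddings are already computed and uses plain logistic regression with gradient descent. The data are synthetic stand-ins, and the helper names `train_linear_probe` and `predict` are hypothetical, so this shows the workflow rather than the paper's evaluation.

```python
import numpy as np


def train_linear_probe(embeddings, labels, lr=0.1, steps=300):
    """Fit logistic regression on fixed embeddings with gradient descent."""
    n, d = embeddings.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        # Clip logits so the exponential cannot overflow.
        logits = np.clip(embeddings @ w + b, -30.0, 30.0)
        probs = 1.0 / (1.0 + np.exp(-logits))
        error = probs - labels              # gradient of log-loss w.r.t. logits
        w -= lr * (embeddings.T @ error) / n
        b -= lr * error.mean()
    return w, b


def predict(embeddings, w, b):
    """Return hard 0/1 predictions from the linear probe."""
    return (embeddings @ w + b > 0).astype(int)


# Synthetic stand-in for precomputed study embeddings (not real data).
rng = np.random.default_rng(0)
n_train, n_test, dim = 400, 200, 512
labels = rng.integers(0, 2, size=n_train + n_test)
embeddings = rng.normal(size=(n_train + n_test, dim))
# Hypothetical class signal: shift each class symmetrically around zero.
embeddings += 0.15 * (2 * labels - 1)[:, None]

w, b = train_linear_probe(embeddings[:n_train], labels[:n_train])
accuracy = (predict(embeddings[n_train:], w, b) == labels[n_train:]).mean()
print(f"held-out accuracy: {accuracy:.3f}")
```

Only a 512-element weight vector and a bias are trained here, which is why a model of this kind can be fitted on a CPU once the embeddings exist.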

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12393585
DOI: http://dx.doi.org/10.1101/2025.08.15.25333725

Similar Publications

DeepHDAC3i: Leveraging an Interpretable Deep Learning-based Framework for the Accelerated Discovery of HDAC3 Inhibitors.

IEEE Trans Comput Biol Bioinform

August 2025

Saeed Ahmed, Nalini Schaduangrat, Ittipat Meewan, Watshara Shoombuatong

Epigenetics encompasses dynamic and reversible modifications that regulate gene activity without altering the underlying DNA sequence. Epigenetic processes, including non-coding RNA interactions and DNA methylation, regulate patterns of gene expression by responding to cellular signaling, environmental stimuli, and developmental cues. The balance of histone acetylation is maintained by histone deacetylase (HDAC) and histone acetyltransferase (HAT) activities.

Captioner: Improving change captioning by leveraging momentum cross-view and cross-modality contrastive learning.

Neural Netw

August 2025

National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, 610064, PR China; College of Computer Science, Sichuan University, Chengdu, 610065, PR China.

Lin Deng, Borui Kang, Yuzhong Zhong, Maoning Wang, Jianwei Zhang

The primary goal of change captioning is to identify subtle visual differences between two similar images and express them in natural language. Existing research has been significantly influenced by the task of vision change detection and has mainly concentrated on the identification and description of visual changes. However, we contend that an effective change captioner should go beyond mere detection and description of what has changed.

Multi-view parallel convolutional network for organ segmentation in mediastinal region on CT images.

Neural Netw

August 2025

College of Mechanical and Electrical Engineering, Northeast Forestry University, Harbin, 150040, China.

Yining Xie, Wei Zhou, Jiayi Ma, Fengjiao Wang, Jing Zhao

In lung CT images, mediastinal organ segmentation is crucial for localizing different mediastinal regions. However, existing medical image segmentation methods exhibit significant limitations in modeling the diverse topological structures of organs, sensitivity to intra-class morphological variations, and inter-class feature differentiation. To address these limitations, we propose a novel multi-view parallel convolutional network (MVPCNet), built on an efficient U-shaped encoder-decoder framework.

Communication-Efficient Federated Multi-View Clustering.

IEEE Trans Pattern Anal Mach Intell

August 2025

Jiyuan Liu, Xinwang Liu, Siqi Wang, Xinhang Wan, Dongsheng Li

Federated multi-view clustering is an emerging machine learning paradigm that groups data whose views are each held on an isolated client while preserving privacy. Although recent research has proposed a few feasible solutions, they are severely limited by two drawbacks. Specifically, the clients are required to share their data representations at each iteration of model training, leading to heavy communication overhead.
