Transfer learning prediction of type 2 diabetes with unpaired clinical and genetic data.

YounSung Jung, SeanKyo Han, EunHee Kang, SoYoung Park, MinHee Kim, NanHee Kim, TaeJin Ahn

Sci Rep

Department of Life Science, Handong Global University, Pohang, Republic of Korea.

Published: July 2025

The prevalence of type 2 diabetes mellitus (T2DM) in Korea has risen in recent years, yet many cases remain undiagnosed. Advanced artificial intelligence models using multi-modal data have shown promise in disease prediction, but two major challenges persist: the scarcity of samples containing all desired data modalities and class imbalance in T2DM datasets. We propose a novel transfer learning framework to predict T2DM onset within five years, using two Korean cohorts (KoGES and SNUH). To utilize unpaired multi-modal data, our approach transfers knowledge between clinical and genetic domains, leveraging unpaired clinical data alongside paired data. We also address class imbalance by applying a positively weighted binary cross-entropy (BCE) loss and a weighted random sampler (WRS). The transfer learning framework improved T2DM prediction performance. Using WRS and weighted BCE loss increased the model's balanced accuracy and AUC (achieving test AUC 0.8441). Furthermore, combining transfer learning with intermediate data fusion yielded even higher performance (test AUC 0.8715). These enhancements were achieved despite limited paired multi-modal samples. Our framework effectively handles scarce paired data and class imbalance, leading to improved T2DM risk prediction. This approach can be adapted to other medical prediction tasks and integrated with additional data modalities, potentially aiding earlier diagnosis and better disease management in clinical settings.
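
The two class-imbalance remedies named in the abstract, a positively weighted BCE loss and a weighted random sampler, can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation: the function names, the inverse-class-frequency sampling weights, the choice pos_weight = negatives / positives, and the toy 90/10 label split are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_bce(logits, labels, pos_weight):
    """Mean binary cross-entropy on raw logits.

    The positive-class term is multiplied by pos_weight, so errors on the
    minority (positive) class cost more than errors on the majority class.
    """
    total = 0.0
    for z, y in zip(logits, labels):
        log_p = -softplus(-z)      # log(sigmoid(z))
        log_not_p = -softplus(z)   # log(1 - sigmoid(z))
        total += -(pos_weight * y * log_p + (1 - y) * log_not_p)
    return total / len(labels)


def inverse_frequency_weights(labels):
    """Per-sample weight 1 / (size of the sample's class). Assumed scheme."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_sample_indices(labels, num_samples, seed=0):
    """Draw indices with replacement so each class is picked about equally often."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


if __name__ == "__main__":
    # Hypothetical imbalanced cohort: 90 negatives, 10 positives.
    labels = [0] * 90 + [1] * 10
    pos_weight = labels.count(0) / labels.count(1)

    # An uninformative model (logit 0 everywhere) scored with and without weighting.
    logits = [0.0] * len(labels)
    print("plain BCE:   ", weighted_bce(logits, labels, 1.0))
    print("weighted BCE:", weighted_bce(logits, labels, pos_weight))

    # Fraction of positives in a resampled epoch.
    idx = weighted_sample_indices(labels, 10_000)
    print("positive fraction after resampling:", sum(labels[i] for i in idx) / len(idx))
```

In PyTorch, comparable behavior is available through the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss` and through `torch.utils.data.WeightedRandomSampler`; the abstract does not state which library or weight values the authors used.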

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12307585
DOI: http://dx.doi.org/10.1038/s41598-025-05532-w

Top Keywords: transfer learning; class imbalance; data; type 2 diabetes; unpaired clinical; clinical genetic; multi-modal data; data modalities; learning framework; paired data

Similar Publications

Letter to editor about "Utilizing explainable machine learning for progression-free survival prediction in high-grade serous ovarian cancer: insights from a prospective cohort study".

Int J Surg

September 2025

Shenzhen Traditional Chinese Medicine Hospital, The Fourth Clinical Medical College of Guangzhou University of Chinese Medicine, Shenzhen, People's Republic of China.

Mengying Bai, Wenbo Wu, Yuehui Zheng

Unveiling molecular signatures for precision drug design: machine learning insights from trypanothione reductase, PKC-θ, and CB1.

Mol Divers

September 2025

Department of Biotechnology, National Institute of Technology Raipur, Raipur, Chhattisgarh, 492001, India.

Sunil Sahu, Adarsh Anmol, Tushar Nishad, Satya Eswari Jujjavarapu

Traditional drug discovery methods like high-throughput screening and molecular docking are slow and costly. This study introduces a machine learning framework to predict bioactivity (pIC₅₀) and identify key molecular properties and structural features for targeting Trypanothione reductase (TR), Protein kinase C theta (PKC-θ), and Cannabinoid receptor 1 (CB1) using data from the ChEMBL database. Molecular fingerprints, generated via PaDEL-Descriptor and RDKit, encoded structural features as binary vectors.

Oral bioavailability property prediction based on task similarity transfer learning.

Mol Divers

September 2025

Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, 211198, China.

Chen Zeng, Chengcheng Xu, Yingxu Liu, Yunya Jiang, Lidan Zheng

Drug absorption significantly influences pharmacokinetics. Accurately predicting human oral bioavailability (HOB) is essential for optimizing drug candidates and improving clinical success rates. HOB is commonly measured experimentally, but experimental measurement is time-consuming and costly.

Decoding binocular color differences via EEG signals: linking ERP dynamics to chromatic disparity in CIELAB space.

Exp Brain Res

September 2025

School of Information Science and Technology, Yunnan Normal University, Kunming, 650500, China.

Famiao Mou, Zhineng Lv, Xuesong Jin, Jijun Pan, Lijun Yun

This study explores how differences in colors presented separately to each eye (binocular color differences) can be identified through EEG signals, a method of recording electrical activity from the brain. Four distinct levels of green-red color differences, defined in the CIELAB color space with constant luminance and chroma, are investigated in this study. Analysis of Event-Related Potentials (ERPs) revealed a significant decrease in the amplitude of the P300 component as binocular color differences increased, suggesting a measurable brain response to these differences.

Using Medication Dispensation Data to Identify Clusters with Similar Prescribing Patterns in Older Adults Living with Dementia.

Drugs Aging

September 2025

Dalla Lana School of Public Health, University of Toronto, V1 06, 2075 Bayview Avenue, Toronto, ON, M4N 3M5, Canada.

Abby Emdin, Therese A Stukel, Jennifer Bethell, Xuesong Wang, Andrea Iaboni

Background and Objectives: Older adults living with dementia are a heterogeneous group, which can make studying optimal medication management challenging. Unsupervised machine learning is a group of computing methods that rely on unlabeled data; that is, the algorithm itself discovers patterns without researchers needing to label the data with a known outcome. These methods may help us to better understand complex prescribing patterns in this population.
