Label Accuracy in Electronic Health Records and Its Impact on Machine Learning Models for Early Prediction of Gestational Diabetes: 3-Step Retrospective Validation Study.

Mark Germaine, Amy C O'Higgins, Brendan Egan, Graham Healy

JMIR Med Inform

School of Computing, Dublin City University, Glasnevin, Dublin, D09 V209, Ireland, 353 1 700 8803.

Published: August 2025

Background: Several studies have used electronic health records (EHRs) to build machine learning models predicting the likelihood of developing gestational diabetes mellitus (GDM) later in pregnancy, but none have described validation of the GDM "label" within the EHRs.

Objective: This study examines the accuracy of GDM diagnoses in EHRs compared with a clinical team database (CTD) and their impact on machine learning models.

Methods: EHRs from 2018 to 2022 were validated against CTD data to identify true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). Logistic regression models were trained and tested using both EHR and validated labels; simulated label noise was then introduced to increase FP and FN rates. Model performance was assessed using the area under the receiver operating characteristic curve (ROC AUC) and average precision (AP).
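The validation and noise-injection steps described in the Methods can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names cross_tabulate and inject_label_noise, the binary 0/1 label encoding, and the toy data are all assumptions made for the example, and the CTD label is treated as the reference standard.

```python
import random
from collections import Counter


def cross_tabulate(ehr_labels, ctd_labels):
    """Count TP/FP/TN/FN, treating the CTD label as the reference standard."""
    counts = Counter()
    for ehr, ctd in zip(ehr_labels, ctd_labels):
        if ehr and ctd:
            counts["TP"] += 1      # GDM recorded in both sources
        elif ehr and not ctd:
            counts["FP"] += 1      # EHR label with no matching CTD label
        elif not ehr and ctd:
            counts["FN"] += 1      # CTD diagnosis missing from the EHR
        else:
            counts["TN"] += 1      # no GDM in either source
    return counts


def inject_label_noise(labels, fp_rate, fn_rate, seed=0):
    """Return a noisy copy of 0/1 labels.

    Each negative flips to 1 with probability fp_rate (extra false positives);
    each positive flips to 0 with probability fn_rate (extra false negatives).
    """
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        flip_prob = fn_rate if y == 1 else fp_rate
        noisy.append(1 - y if rng.random() < flip_prob else y)
    return noisy


# Toy data (hypothetical, not taken from the study).
ctd = [1, 1, 1, 0, 0, 0, 0, 0]
ehr = [1, 1, 0, 1, 0, 0, 0, 0]
counts = cross_tabulate(ehr, ctd)
print(dict(counts))

# Simulate a noisier label set by flipping 25% of negatives and 10% of positives.
print(inject_label_noise(ctd, fp_rate=0.25, fn_rate=0.10, seed=1))
```

Keeping the FP and FN rates as separate parameters lets each kind of error be varied independently, which is what allows their effects on training and on evaluation to be compared.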

Results: Among 3952 patients with a GDM diagnosis in the CTD, 3388 (85.7%) also had a GDM label in the EHRs, while 564 lacked one; a further 771 patients had a GDM label in the EHRs but no corresponding CTD label. Overall, 32,928 (87.5%) of cases were TN, 3388 (9%) TP, 771 (2%) FP, and 564 (1.5%) FN. The model trained and tested with validated labels achieved an ROC AUC of 0.817 and an AP of 0.450, whereas the same model tested using EHR labels achieved 0.814 and 0.395, respectively. Increased label noise during training led to gradual declines in ROC AUC and AP, while noise in the test set, especially elevated FP rates, resulted in marked performance drops.
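As a quick arithmetic check, the sketch below recomputes the reported percentages from the four counts in the Results and derives the sensitivity and positive predictive value (PPV) of the EHR label relative to the CTD. The derived sensitivity and PPV are plain arithmetic on the reported counts, not figures stated by the authors, and they assume the CTD is the reference standard.

```python
# Counts as reported in the Results paragraph.
counts = {"TN": 32_928, "TP": 3_388, "FP": 771, "FN": 564}
total = sum(counts.values())

# Share of each cell in the full cohort, as a percentage rounded to 1 decimal.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Derived quantities (assumption: CTD is the reference standard).
sensitivity = counts["TP"] / (counts["TP"] + counts["FN"])  # EHR captures CTD cases
ppv = counts["TP"] / (counts["TP"] + counts["FP"])          # EHR labels confirmed by CTD

print(f"total cases: {total}")
print(f"cell shares (%): {shares}")
print(f"EHR label sensitivity: {sensitivity:.3f}")
print(f"EHR label PPV: {ppv:.3f}")
```

The sensitivity here is the same 3388 of 3952 ratio that the abstract reports as 85.7%.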

Conclusions: Discrepancies between EHR and CTD diagnoses had a limited impact on model training but significantly affected performance evaluation when present in the test set, emphasizing the importance of accurate data validation.
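One way to see why test-set label noise distorts evaluation is to score a fixed set of predictions against clean and noisy labels. The sketch below is a synthetic illustration under stated assumptions, not a reproduction of the study: the Gaussian score distributions, the 10% prevalence, the 5% false-positive flip rate, and the helper names roc_auc and average_precision are all hypothetical choices for the example.

```python
import random


def roc_auc(labels, scores):
    """Rank-based ROC AUC (Mann-Whitney U statistic); assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i] == 1)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Mean of the precision at the rank of each positive, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, precision_sum = 0, 0.0
    for k, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / k
    return precision_sum / hits


rng = random.Random(42)
n = 20_000

# Synthetic "true" labels at roughly 10% prevalence (assumed).
true_labels = [1 if rng.random() < 0.10 else 0 for _ in range(n)]

# Fixed model scores: positives score higher on average (assumed separation).
scores = [rng.gauss(1.3 if y else 0.0, 1.0) for y in true_labels]

# Test-set noise: relabel about 5% of true negatives as positive (extra FPs).
noisy_labels = [1 if (y == 0 and rng.random() < 0.05) else y for y in true_labels]

auc_clean = roc_auc(true_labels, scores)
auc_noisy = roc_auc(noisy_labels, scores)
ap_clean = average_precision(true_labels, scores)
ap_noisy = average_precision(noisy_labels, scores)

print(f"ROC AUC  clean labels: {auc_clean:.3f}  noisy labels: {auc_noisy:.3f}")
print(f"AP       clean labels: {ap_clean:.3f}  noisy labels: {ap_noisy:.3f}")
```

The scores are identical in both evaluations, so any difference in the measured metrics comes from the labels alone. In this sketch the relabelled negatives carry scores typical of true negatives, which pulls the measured ROC AUC toward 0.5 even though the model itself is unchanged.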

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12377786
DOI: http://dx.doi.org/10.2196/72938
