Background: Several studies have used electronic health records (EHRs) to build machine learning models predicting the likelihood of developing gestational diabetes mellitus (GDM) later in pregnancy, but none have described validation of the GDM "label" within the EHRs.
Objective: This study examines the accuracy of GDM diagnoses in EHRs compared with a clinical team database (CTD) and their impact on machine learning models.
Methods: EHRs from 2018 to 2022 were validated against CTD data to identify true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). Logistic regression models were trained and tested using both EHR and validated labels, after which simulated label noise was introduced to increase the FP and FN rates. Model performance was assessed using the area under the receiver operating characteristic curve (ROC AUC) and average precision (AP).
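The validation and noise-injection steps described in the Methods can be illustrated with a minimal sketch. It is not the study's code: the function names `cross_tabulate` and `add_label_noise` are hypothetical, and the noise scheme (flipping each validated negative to positive with probability `fp_rate`, and each validated positive to negative with probability `fn_rate`) is an assumption, since the abstract does not state how the noise was parametrized.

```python
import random


def cross_tabulate(ehr_labels, ctd_labels):
    """Count TP/FP/TN/FN for EHR labels, treating CTD labels as the reference."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for ehr, ctd in zip(ehr_labels, ctd_labels):
        if ehr and ctd:
            counts["TP"] += 1  # GDM in both sources
        elif ehr and not ctd:
            counts["FP"] += 1  # GDM in EHR only
        elif ctd:
            counts["FN"] += 1  # GDM in CTD only
        else:
            counts["TN"] += 1  # GDM in neither source
    return counts


def add_label_noise(labels, fp_rate, fn_rate, seed=0):
    """Return a copy of `labels` with simulated label noise (assumed scheme).

    Each 0 becomes 1 with probability fp_rate (extra false positives);
    each 1 becomes 0 with probability fn_rate (extra false negatives).
    """
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if y == 1 and rng.random() < fn_rate:
            noisy.append(0)
        elif y == 0 and rng.random() < fp_rate:
            noisy.append(1)
        else:
            noisy.append(y)
    return noisy


# Toy data, not study data: one patient in each cell of the 2x2 table.
ehr = [1, 1, 0, 0]
ctd = [1, 0, 1, 0]
table = cross_tabulate(ehr, ctd)
noisy_ctd = add_label_noise(ctd, fp_rate=0.1, fn_rate=0.1, seed=42)
```

Applying `add_label_noise` separately to the training labels and to the test labels would allow the two effects reported in the Results (noise during training versus noise at evaluation) to be examined independently.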
Results: Among 3952 patients with GDM recorded in the CTD, 3388 (85.7%) also carried a GDM label in the EHRs and 564 did not; a further 771 patients had a GDM label in the EHRs with no corresponding CTD label. Overall, 32,928 (87.5%) of cases were TN, 3388 (9%) TP, 771 (2%) FP, and 564 (1.5%) FN. The model trained and tested with validated labels achieved an ROC AUC of 0.817 and an AP of 0.450, whereas the same model tested using EHR labels achieved 0.814 and 0.395, respectively. Increased label noise during training led to gradual declines in ROC AUC and AP, while noise in the test set, especially elevated FP rates, resulted in marked performance drops.
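The reported percentages can be checked from the four counts alone. The short sketch below does that arithmetic and also derives two figures that the abstract does not state, the sensitivity and positive predictive value (PPV) of the EHR label. These are computed on the assumption that the CTD label is the reference standard; the variable names are illustrative.

```python
# Counts as reported in the Results.
tn, tp, fp, fn = 32928, 3388, 771, 564

total = tn + tp + fp + fn  # all cases in the validated cohort

# Share of each cell in the full cohort, in percent.
shares = {name: 100 * n / total
          for name, n in {"TN": tn, "TP": tp, "FP": fp, "FN": fn}.items()}

# Derived here, not reported in the abstract (CTD assumed as reference):
sensitivity = tp / (tp + fn)  # CTD-confirmed GDM that the EHR also labels
ppv = tp / (tp + fp)          # EHR GDM labels that the CTD confirms

print(f"total cases: {total}")
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
print(f"EHR label sensitivity: {100 * sensitivity:.1f}%")
print(f"EHR label PPV: {100 * ppv:.1f}%")
```

The sensitivity reproduces the 85.7% agreement figure given in the Results (3388 of 3952), which supports reading the 3952 patients as those with a GDM diagnosis in the CTD.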
Conclusions: Discrepancies between EHR and CTD diagnoses had a limited impact on model training but significantly affected performance evaluation when present in the test set, emphasizing the importance of accurate data validation.
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12377786
DOI: http://dx.doi.org/10.2196/72938
Stroke
September 2025
Department of Neurology, Yale School of Medicine, New Haven, CT (L.H.S.).
Preclinical stroke research faces a critical translational gap, with animal studies failing to reliably predict clinical efficacy. To address this, the field is moving toward rigorous, multicenter preclinical randomized controlled trials (mpRCTs) that mimic phase 3 clinical trials in several key components. This collective statement, derived from experts involved in mpRCTs, outlines considerations for designing and executing such trials.
F1000Res
September 2025
Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK.
Background: Subcellular localisation is a determining factor of protein function. Mass spectrometry-based correlation profiling experiments facilitate the classification of protein subcellular localisation on a proteome-wide scale. In turn, static localisations can be compared across conditions to identify differential protein localisation events.
Anal Methods
September 2025
College of Science, Kunming University of Science and Technology, Kunming, 650500, China.
To address the technical challenges associated with determining the chronological order of overlapping stamps and textual content in forensic document examination, this study proposes a novel non-destructive method that integrates hyperspectral imaging (HSI) with convolutional neural networks (CNNs). A multi-type cross-sequence dataset was constructed, comprising 60 samples of handwriting-stamp sequences and 20 samples of printed text-stamp sequences, all subjected to six months of natural aging. Spectral responses were collected across the 400-1000 nm range in the overlapping regions.
Periodontol 2000
September 2025
Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.
Oral cancer is a major global health burden, ranking sixth in prevalence, with oral squamous cell carcinoma (OSCC) being the most common type. Importantly, OSCC is often diagnosed at late stages, underscoring the need for innovative methods for early detection. The oral microbiome, an active microbial community within the oral cavity, holds promise as a biomarker for the prediction and progression of cancer.
Hum Brain Mapp
September 2025
Department of Neurosurgery, Heidelberg University Hospital, Heidelberg, Germany.
Postoperative aphasia (POA) is a common complication in patients undergoing surgery for language-eloquent lesions. This study aimed to enhance the prediction of POA by leveraging preoperative navigated transcranial magnetic stimulation (nTMS) language mapping and diffusion tensor imaging (DTI)-based tractography, incorporating deep learning (DL) algorithms. One hundred patients with left-hemispheric lesions were retrospectively enrolled (43 developed postoperative aphasia, as the POA group; 57 did not, as the non-aphasia (NA) group).