EHR-ML: A data-driven framework for designing machine learning applications with electronic health records.

Int J Med Inform

Department of Infectious Diseases, The Alfred Hospital and Central Clinical School, Monash University, Melbourne, 3000, VIC, Australia; School of Computing Technologies, RMIT University, Melbourne, 3000, VIC, Australia.

Published: April 2025


Article Abstract

Objective: The healthcare landscape is being transformed by the integration of Artificial Intelligence (AI) into traditional analytic workflows. However, this integration faces challenges that have resulted in a crisis of generalisability. Key obstacles include: 1) insufficient consideration of local contextual factors, such as institution-specific data formats, practices, and protocols, which vary across institutions; 2) ad-hoc data preparation and design of machine learning strategies; 3) manual, subjective adjustment of design parameters, resulting in sub-optimal performance; 4) EHR-specific challenges, namely data biases that affect model outcomes and the intermittent temporal nature of the data, which requires specialised handling; and 5) a lack of cross-institutional data validation.

Methods: To address these challenges, EHR-ML provides an easy-to-use, structured framework for designing optimal machine learning applications in a data-driven manner. The framework supports ingestion of local institutional electronic health records (EHRs) and process standardisation. Study design and parameter optimisation are carried out in a fully data-driven, evidence-based manner. The framework integrates seamlessly with existing quality control tools. To handle the unique characteristics of EHR data, it offers customisable ensemble models. It enables the acquisition of EHR data from diverse systems and harmonises the data into common formats following international standards.
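As a rough illustration of what data-driven handling of intermittent EHR measurements could look like, the sketch below summarises irregularly timed readings over a candidate observation window and then picks the window length with the best validation score. It is not EHR-ML's API: the names `window_features` and `select_window`, the `(hour, value)` event layout, and the placeholder scores are all assumptions made for this example.

```python
from statistics import mean


def window_features(events, anchor_hour, window_hours):
    """Summarise (hour, value) readings that fall in the window before anchor_hour.

    Hypothetical helper: `events` is a list of irregularly timed readings.
    """
    start = anchor_hour - window_hours
    values = [value for hour, value in events if start <= hour < anchor_hour]
    if not values:
        return None  # no reading in the window; the caller decides how to impute
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "count": len(values),
    }


def select_window(candidates, score_fn):
    """Return the candidate window length with the highest validation score."""
    return max(candidates, key=score_fn)


# Toy readings for one patient: (hours since admission, measured value).
events = [(1, 80.0), (5, 90.0), (20, 110.0), (23, 120.0)]
features_6h = window_features(events, anchor_hour=24, window_hours=6)

# Placeholder validation scores per window length (made up, not results from the paper).
toy_scores = {6: 0.71, 12: 0.78, 24: 0.75}
best_window = select_window(toy_scores.keys(), lambda hours: toy_scores[hours])
```

In practice `score_fn` would train and validate a model on features built with the given window, so that the window length is chosen from evidence rather than set by hand.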

Results: The effectiveness of EHR-ML is demonstrated through a series of case studies. These studies highlight its capability to develop high-performance models in a fully automated manner, consistently surpassing the performance of traditional methodologies. Furthermore, the resulting models exhibited strong generalisability across diverse healthcare settings.

Discussion and Conclusion: EHR-ML enhances the clinical relevance and accuracy of predictive models by incorporating local context into machine learning applications. Additionally, by providing a user-friendly, fully automated framework, it facilitates rapid hypothesis testing aimed at generating localised biomedical knowledge.

Source
http://dx.doi.org/10.1016/j.ijmedinf.2025.105816

Similar Publications

Introduction: Vision language models (VLMs) combine image analysis capabilities with large language models (LLMs). Because of their multimodal capabilities, VLMs offer a clinical advantage over image classification models for the diagnosis of optic disc swelling by allowing a consideration of clinical context. In this study, we compare the performance of non-specialty-trained VLMs with different prompts in the classification of optic disc swelling on fundus photographs.

Multi-Omics and Clinical Validation Identify Key Glycolysis- and Immune-Related Genes in Sepsis.

Int J Gen Med

September 2025

Department of Geriatrics, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, 610072, People's Republic of China.

Background: Sepsis is characterized by profound immune and metabolic perturbations, with glycolysis serving as a pivotal modulator of immune responses. However, the molecular mechanisms linking glycolytic reprogramming to immune dysfunction remain poorly defined.

Methods: Transcriptomic profiles of sepsis were obtained from the Gene Expression Omnibus.

Accurate differentiation between persistent vegetative state (PVS) and minimally conscious state, and estimation of the likelihood of recovery in patients in PVS, are crucial. This study analyzed electroencephalography (EEG) metrics to investigate their relationship with consciousness improvements in patients in PVS and developed a machine learning prediction model. We retrospectively evaluated 19 patients in PVS, categorizing them into two groups: those with improved consciousness (n = 7) and those without improvement (n = 12).

Artificial intelligence (AI) is a technique or tool to simulate or emulate human "intelligence." Precision medicine or precision histology refers to the subpopulation-tailored diagnosis, therapeutics, and management of diseases, with its sociocultural, behavioral, genomic, transcriptomic, and pharmaco-omic implications. The current decade has seen a quantum leap in AI-based models across many aspects of daily routine, including the practice of precision medicine and histology.

Introduction: Spinal cord injury (SCI) presents a significant burden to patients, families, and the healthcare system. The ability to accurately predict functional outcomes for SCI patients is essential for optimizing rehabilitation strategies, guiding patient and family decision making, and improving patient care.

Methods: We conducted a retrospective analysis of 589 SCI patients admitted to a single acute rehabilitation facility and used the dataset to train advanced machine learning algorithms to predict patients' rehabilitation outcomes.
