Publications by authors named "Matthew Churpek"

Suspected infection requiring hospitalisation has a highly heterogeneous presentation, yet variation in host response and its implications are largely unknown. In this multicentre cohort of 3802 patients presenting to the Emergency Department (ED) with suspected infection requiring hospitalisation, we apply uniform manifold approximation and projection (UMAP) and K-means clustering to 29 plasma proteins to identify biologically discrete host response clusters.
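The clustering step named in this abstract can be illustrated with a minimal K-means sketch (Lloyd's algorithm) in plain Python. This is not the authors' pipeline: the UMAP step is omitted, and the function name `kmeans`, the seed, and the toy two-dimensional points are assumptions chosen only for illustration.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Minimal Lloyd's algorithm; returns (labels, centroids).

    points: list of equal-length numeric tuples (hypothetical feature vectors).
    """
    rng = random.Random(seed)
    # Initialise centroids by sampling k distinct input points.
    centroids = list(rng.sample(points, k))
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        # Stop once assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Toy data: two well-separated groups standing in for protein profiles.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels, toy_centroids = kmeans(toy_points, k=2)
```

In practice the number of clusters and the preprocessing of the protein measurements would have to be chosen and validated separately; the sketch only shows the assign-and-update loop.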


Objective: To evaluate the efficacy of digital twins developed using a large language model (LLaMA-3), fine-tuned with Low-Rank Adapters (LoRA) on ICU physician notes, and to determine whether specialty-specific training enhances treatment recommendation accuracy compared to other ICU specialties or zero-shot baselines.

Materials and Methods: Digital twins were created using LLaMA-3 fine-tuned on discharge summaries from the MIMIC-III dataset, where medications were masked to construct training and testing datasets. The medical ICU dataset (1,000 notes) was used for evaluation, and performance was assessed using BERTScore and ROUGE-L.
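For readers unfamiliar with ROUGE-L, the sketch below shows the idea behind it: the score is built from the longest common subsequence (LCS) of tokens shared by a reference text and a generated text. This is a simplified illustration, not the evaluation code used in the study. Whitespace tokenization, lowercasing, the balanced F1 form of the score, and the example sentences are all assumptions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Balanced F1 over LCS-based precision and recall."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference and generated medication text.
score = rouge_l_f1("start heparin and aspirin", "start aspirin")
```

Here the LCS is "start aspirin" (2 tokens), so precision is 2/2, recall is 2/4, and the F1 score is 2/3. BERTScore, by contrast, compares contextual embeddings rather than exact token matches and needs a pretrained model, so it is not sketched here.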


Objective: Risk prediction models are used in hospitals to identify pediatric patients at risk of clinical deterioration, enabling timely interventions and rescue. The objective of this study was to develop a new explainer algorithm that uses a patient's clinical notes to generate text-based explanations for risk prediction alerts.

Materials and Methods: We conducted a retrospective study of 39 406 patient admissions to the American Family Children's Hospital at the University of Wisconsin-Madison (2009-2020).


Objectives: Body temperature trajectories of infected patients are associated with dynamic clinical and immune responses to infection. Our objective was to evaluate the association between temperature trajectory subphenotypes and cardiac dysfunction determined by echocardiography.

Design: Retrospective cohort study.


Background: Early detection of clinical deterioration and timely intervention can improve outcomes for hospitalized patients. Existing early warning systems rely on variables from structured data, such as vital signs and laboratory values, and do not incorporate other potentially predictive data modalities. Because respiratory failure is a common cause of deterioration, chest radiographs are often acquired in patients with clinical deterioration and may be informative for predicting their risk of intensive care unit (ICU) transfer.


Background: Implementing machine learning models to identify clinical deterioration on the wards is associated with decreased morbidity and mortality. However, these models have high false positive rates and only use structured data.

Objective: We aimed to compare models with and without information from clinical notes for predicting deterioration.


Background: In 2018, the U.S. heart allocation policy underwent a major change designed to increase the transplantation of the most medically urgent candidates.


Importance: Unrecognized deterioration among hospitalized children is associated with a high risk of mortality and morbidity. The current approach to pediatric risk stratification is fragmented, as each hospital unit (emergency, ward, or intensive care) uses different tools for predicting specific outcomes.

Objective: To develop a machine learning model for the early detection of deterioration across all units, thereby enabling a unified risk assessment throughout the patient's hospital stay.


In the evolving landscape of clinical Natural Language Generation (NLG), assessing abstractive text quality remains challenging, as existing methods often overlook generative task complexities. This work aimed to examine the current state of automated evaluation metrics in NLG in healthcare. To have a robust and well-validated baseline with which to examine the alignment of these metrics, we created a comprehensive human evaluation framework.


Short-form Video Addiction (SVA), a novel digital addiction of the modern world, proliferates among young adults and is not formally diagnosable. SVA detection from resulting bio-signals is crucial to prevent its adverse impacts. Existing formal methods involve large and expensive neuro-imaging devices in laboratory setups that are intrusive and not feasible to use in daily life.


Importance: Current protocols to triage life support use scores that are biased and inaccurate.

Objectives: To determine if adding age to triage protocols used in disaster scenarios improves the identification of critically ill patients likely to survive.

Design, Setting, and Participants: Observational cohort study from March 1, 2020, to March 1, 2022, at 22 hospitals in three networks, divided into derivation (12 hospitals) and validation (10 hospitals) cohorts.


Electronic Health Records (EHRs) store vast amounts of clinical information, making it difficult for healthcare providers to summarize and synthesize the details relevant to their practice. To reduce this cognitive load, generative AI with Large Language Models (LLMs) has emerged as a way to automatically summarize patient records into clear, actionable insights. However, LLM summaries need to be precise and free from errors, making evaluation of summary quality necessary.


Clinicians aim to provide treatments that will result in the best outcome for each patient. Ideally, treatment decisions are based on evidence from randomised clinical trials. Randomised trials conventionally report an aggregated difference in outcomes between patients in each group, known as an average treatment effect.
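As a small worked illustration of the average treatment effect mentioned above, the sketch below computes the difference in mean outcomes between two trial arms. The function name and the outcome values are hypothetical and are not drawn from any study listed here.

```python
from statistics import mean


def average_treatment_effect(treated_outcomes, control_outcomes):
    """Difference in mean outcome: treated arm minus control arm."""
    return mean(treated_outcomes) - mean(control_outcomes)


# Hypothetical binary outcomes (1 = death, 0 = survival) per patient.
treated = [1, 0, 0, 0]  # 25% mortality in the treated arm
control = [1, 1, 0, 0]  # 50% mortality in the control arm
ate = average_treatment_effect(treated, control)
```

With these made-up data the result is -0.25, an absolute reduction in mortality of 25 percentage points. Because it is an average over everyone in each arm, this single number says nothing about which individual patients benefit more or less from treatment.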


Key Points: We developed and validated a multimodal (structured and unstructured data) model to predict moderate to severe acute kidney injury (AKI) using multicenter data. This multimodal AKI risk score accurately identifies patients who will develop stage 2 AKI over 2 days earlier than serum creatinine alone. The multimodal model performed better than a model based solely on structured data and performed similarly during temporal and site-based validation.


Adults with opioid use disorder (OUD) are at increased risk for opioid-related complications and repeated hospital admissions. Routine screening for patients at risk for an OUD to prevent complications is not standard practice in many hospitals, leading to missed opportunities for intervention. The adoption of electronic health records (EHRs) and advancements in artificial intelligence (AI) offer a scalable approach to systematically identify at-risk patients for evidence-based care.


Background: Early detection of clinical deterioration using machine-learning early warning scores may improve outcomes. However, most implemented scores were developed using logistic regression, only underwent retrospective validation, and were not tested in important subgroups.

Objective: The objective of our multicenter retrospective and prospective observational study was to develop and prospectively validate a gradient-boosted machine model (eCARTv5) for identifying clinical deterioration on the wards.


Objective: Implementing machine learning models to identify clinical deterioration on the wards is associated with improved outcomes. However, these models have high false positive rates and only use structured data. Therefore, we aim to compare models with and without information from clinical notes for predicting deterioration.


Background: Electronic health records (EHRs) and routine documentation practices play a vital role in patients' daily care, providing a holistic record of health, diagnoses, and treatment. However, complex and verbose EHR narratives can overwhelm health care providers, increasing the risk of diagnostic inaccuracies. While large language models (LLMs) have showcased their potential in diverse language tasks, their application in health care must prioritize the minimization of diagnostic errors and the prevention of patient harm.


Objectives: To describe the deployment of pediatric Calculated Assessment of Risk and Triage (pCART), a machine learning (ML) model that predicts the risk of direct ward-to-ICU transfer within 12 hours, and the associated improvement in outcomes among hospitalized children.

Design: Pre- vs. post-implementation study.


Background: Postdischarge venous thromboembolism (pdVTE) is a life-threatening complication following resection for pancreatic cancer (PC). While national guidelines recommend extended chemoprophylaxis for all patients, adherence is low, ranging from 1.5% to 44%.


This study aimed to evaluate the performance of machine learning models for predicting readmission of patients with chronic obstructive pulmonary disease (COPD) based on administrative data and chart review data. The study analyzed 4327 patient encounters from the University of Chicago Medicine to assess the risk of readmission within 90 days after an acute exacerbation of COPD. Two random forest prediction models were compared.


Objective: To evaluate large language models (LLMs) for pre-test diagnostic probability estimation and compare their uncertainty estimation performance with a traditional machine learning classifier.

Materials and Methods: We assessed 2 instruction-tuned LLMs, Mistral-7B-Instruct and Llama3-70B-chat-hf, on predicting binary outcomes for Sepsis, Arrhythmia, and Congestive Heart Failure (CHF) using electronic health record (EHR) data from 660 patients. Three uncertainty estimation methods (Verbalized Confidence, Token Logits, and LLM Embedding+XGB) were compared against an eXtreme Gradient Boosting (XGB) classifier trained on raw EHR data.


Introduction: Obesity, defined as a body mass index ≥30 kg/m², is a major public health concern in the United States. Preventative approaches are essential, but they are limited by an inability to accurately predict individuals at highest risk of weight gain. Our objective was to develop accurate weight gain prediction models using the National Institutes of Health All of Us dataset.


Objectives: Applying large language models (LLMs) to the clinical domain is challenging due to the context-heavy nature of processing medical records. Retrieval-augmented generation (RAG) offers a solution by facilitating reasoning over large text sources. However, there are many parameters to optimize in just the retrieval system alone.
