Unsupervised clustering of longitudinal clinical measurements in electronic health records.

PLOS Digit Health

Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America.

Published: October 2024


Article Abstract

Longitudinal electronic health records (EHR) can be used to identify patterns of disease development and progression in real-world settings. Unsupervised temporal matching algorithms, originally developed for signal-processing and protein-sequence alignment tasks, are being repurposed for EHR data, where they have shown promise for gaining insight into disease. The robustness of these algorithms for classifying EHR clinical data remains to be determined. Time series compiled from clinical measurements, such as blood pressure, are sampled far more irregularly and have far more missing values than the data for which these algorithms were developed, which calls for a systematic evaluation of these methods. We applied 30 state-of-the-art unsupervised machine learning algorithms to 6,912 systematically generated simulated clinical datasets that varied across five parameters. These algorithms included eight temporal matching algorithms with fourteen partitional and eight fuzzy clustering methods. Nemenyi tests were used to determine differences in accuracy, measured with the Adjusted Rand Index (ARI). Dynamic time warping and its lower-bound variants had the highest accuracies across all cohorts (median ARI > 0.70). All 30 methods were better at discriminating classes that differed in magnitude than classes that differed in trajectory shape. Missingness affected accuracy only when classes differed by trajectory shape. The method with the highest ARI was then used to cluster a large pediatric metabolic syndrome (MetS) cohort (N = 43,426). We identified three unique childhood BMI patterns with high average cluster consensus (>70%). The algorithm identified a cluster with consistently high BMI that had the greatest risk of MetS, consistent with prior literature (OR = 4.87, 95% CI: 3.93-6.12).
While these algorithms have been shown to have similar accuracies for regular time series, their accuracies in clinical applications vary substantially when discriminating differences in shape, and especially under moderate to high missingness (>10%). This systematic assessment also shows that the most robust algorithms tested here can derive meaningful insights from longitudinal clinical data.
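For readers unfamiliar with the two quantities named in the abstract, the following is a minimal illustrative sketch of a textbook dynamic time warping (DTW) distance and of the Adjusted Rand Index computed from a contingency table. It is not the authors' implementation: the function names, the absolute-difference cost, and the toy series are assumptions made for illustration, and the lower-bound DTW variants and clustering methods evaluated in the study are not shown.

```python
from math import comb, inf


def dtw_distance(a, b):
    """Textbook DTW distance with an absolute-difference local cost (assumed here)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two labelings, computed from their contingency table."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # degenerate case: no pairs to compare
    cells, rows, cols = {}, {}, {}
    for t, p in zip(labels_true, labels_pred):
        cells[(t, p)] = cells.get((t, p), 0) + 1
        rows[t] = rows.get(t, 0) + 1
        cols[p] = cols.get(p, 0) + 1
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # both labelings are trivial (e.g. a single cluster each)
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical toy data: two series of unequal length with the same shape.
short_series = [1, 2, 3]
long_series = [1, 2, 2, 3]
print(dtw_distance(short_series, long_series))

# Cluster labels that agree up to renaming, then labels that disagree.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because DTW aligns points by warping the time axis, it can compare series of different lengths, which is one reason such measures are attractive for irregularly sampled clinical measurements. The ARI is invariant to how clusters are labeled and is corrected for chance agreement, so values near zero indicate agreement no better than random assignment.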

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11478862
DOI: http://dx.doi.org/10.1371/journal.pdig.0000628

Similar Publications

Background: Existing longitudinal cohort study data and associated biospecimen libraries provide abundant opportunities to efficiently examine new hypotheses through retrospective specimen testing. Outcome-dependent sampling (ODS) methods offer a powerful alternative to random sampling when testing all available specimens is not feasible or biospecimen preservation is desired. For repeated binary outcomes, a common ODS approach is to extend the case-control framework to the longitudinal setting.

Background And Objectives: The relationship between insomnia and cognitive decline is poorly understood. We investigated associations between chronic insomnia, longitudinal cognitive outcomes, and brain health in older adults.

Methods: From the population-based Mayo Clinic Study of Aging, we identified cognitively unimpaired older adults with or without a diagnosis of chronic insomnia who underwent annual neuropsychological assessments (z-scored global cognitive scores and cognitive status) and had quantified serial imaging outcomes (amyloid-PET burden [centiloid] and white matter hyperintensities from MRI [WMH, % of intracranial volume]).

Background: Crohn's disease (CD) is a chronic inflammatory disease, with a heterogeneous clinical course, which can affect any segment of the gastrointestinal tract. Data on the natural history of CD in developing countries are rare.

Objective: To delineate the clinical, epidemiological, and longitudinal characteristics of CD patients at a Brazilian referral center.

This study aimed to evaluate the longitudinal effect of dentition status on the perceived mobility limitation of community-dwelling Brazilian older adults. This cohort study used data from individuals who participated in the second (2006), third (2010), and fourth (2015) waves of the Health Well-being and Aging Study, conducted in the urban region of the city of São Paulo, Brazil, with adults aged 60 years and older. Mobility limitation was assessed in all waves according to reports of difficulty in performing seven activities, with higher scores representing a higher number of limitations.

A half-day workshop improved palliative care clinicians' ability to integrate psychological concepts into serious illness communication but created demand for longitudinal learning. To pilot "Process Rounds," a four-session, case-based, adapted psychotherapeutic supervision group reinforcing formulation, countertransference, and mindful intervention. Workshop graduates from four cohorts were invited; 25/143 enrolled.
