Citations

20

Article Abstract

It is now common to have a modest to large number of features on individuals with complex diseases. Unsupervised analyses, such as clustering with and without preprocessing by Principal Component Analysis (PCA), are widely used in practice to uncover subgroups in a sample. However, in many modern studies features are often highly correlated and noisy (e.g., SNPs, -omics, quantitative imaging markers, and electronic health record data). The practical performance of clustering approaches in these settings remains unclear. Through extensive simulations and empirical examples applying Gaussian Mixture Models and related clustering methods, we show these approaches (including variants of k-means, VarSelLCM, HDClassifier, and Fisher-EM) can have very poor performance in many settings. We also show the poor performance is often driven by either an explicit or implicit assumption by the clustering algorithm that high variance features are relevant while lower variance features are irrelevant, called the variance as relevance assumption. We develop practical pre-processing approaches that improve analysis performance in some cases. This work offers practical guidance on the strengths and limitations of unsupervised clustering approaches in modern data analysis applications.
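The variance as relevance assumption described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the sample size, the single signal feature, the 20 correlated noise features, and the helper names `kmeans` and `clustering_accuracy` are all assumptions chosen for illustration, and the clustering step is a plain Lloyd's k-means written in NumPy rather than any of the packages named in the abstract. It shows how PCA can concentrate on correlated noise, so that clustering on the first principal component may recover little of the true grouping even after standardization.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
truth = np.repeat([0, 1], n // 2)          # two true subgroups (assumed setup)

# One signal feature: cluster means at -2 and +2 (hypothetical values).
signal = np.where(truth == 0, -2.0, 2.0) + rng.normal(0.0, 0.5, n)

# Twenty noise features that share one latent factor: highly correlated with
# each other, unrelated to the true subgroups.
latent = rng.normal(0.0, 1.0, n)
noise = latent[:, None] + rng.normal(0.0, 0.3, (n, 20))

X = np.column_stack([signal, noise])
Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize every feature

# PCA via SVD of the standardized data; keep the first principal component.
_, _, vt = np.linalg.svd(Xs, full_matrices=False)
pc1 = Xs @ vt[0]


def kmeans(data, k=2, n_iter=100, seed=1):
    """Plain Lloyd's algorithm; returns one cluster label per row of data."""
    init_rng = np.random.default_rng(seed)
    centers = data[init_rng.choice(len(data), size=k, replace=False)]
    labels = np.zeros(len(data), dtype=int)
    for _ in range(n_iter):
        dist = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):        # skip empty clusters
                centers[j] = data[labels == j].mean(axis=0)
    return labels


def clustering_accuracy(labels, truth):
    """Agreement with the truth for two clusters, allowing a label swap."""
    agreement = np.mean(labels == truth)
    return max(agreement, 1.0 - agreement)


# Clustering on PC1 follows the high-variance (correlated noise) direction.
acc_pca = clustering_accuracy(kmeans(pc1[:, None]), truth)
# Clustering on the single signal feature, for comparison.
acc_signal = clustering_accuracy(kmeans(Xs[:, [0]]), truth)

print(f"accuracy using PC1 only:        {acc_pca:.2f}")
print(f"accuracy using signal feature:  {acc_signal:.2f}")
```

In this assumed setup the correlated noise block dominates the leading eigenvalue of the correlation matrix, so the first component carries almost no weight on the signal feature. Standardizing each feature does not help here, because the excess variance comes from correlation across features rather than from the scale of any single feature.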


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11338589
DOI: http://dx.doi.org/10.1080/00949655.2024.2329976

Publication Analysis

Top Keywords

clustering approaches: 8
poor performance: 8
variance features: 8
clustering: 5
limitations clustering: 4
clustering pca: 4
pca correlated: 4
correlated noise: 4
noise common: 4
common modest: 4

Similar Publications

Purpose: This study aims to cross-culturally validate the Dutch version of the Lymphedema Symptom Intensity and Distress Survey-Head and Neck version 2.0 (LSIDS-H&N v2.0).


Harnessing biomarkers to guide immunotherapy in esophageal cancer: toward precision oncology.

Clin Transl Oncol

September 2025

Department of Basic Science, College of Medicine, Princess Nourah bint Abdulrahman University, P.O. Box 84428, 11671, Riyadh, Saudi Arabia.

Esophageal cancer (EC) is one of the most serious health issues around the world, ranking seventh among the most lethal types of cancer and eleventh among the most common types of cancer worldwide. Traditional therapies, such as surgery, chemotherapy, and radiation therapy, often yield limited success, especially in the advanced stages of EC, prompting the pursuit of novel and more effective treatment strategies. Immunotherapy has emerged as a promising option; nonetheless, its clinical success is hindered by variable patient responses.


The application of metabolomics to the water quality monitoring system, biological early warning system (BEWS), has been proposed; however, its development has not been attempted due to challenges such as high inter-individual variability and invasive sampling requirements in metabolomics applications. In this study, we employed an extracellular metabolomics (exo-metabolomics) approach using Daphnia magna to overcome these limitations and evaluate its utility in field river water conditions. From BEWS flow-through chambers, we collected exo-metabolites under ambient, copper exposure (0-80 μg/L), and post-exposure conditions.


Effectiveness of a multimodal information technology-based hand hygiene strategies on reducing healthcare-associated infections in nursing homes: A cluster-randomized controlled trial.

Nurse Educ Pract

August 2025

College of Nursing, Kaohsiung Medical University, 100 Shih-Chuan 1st Rd., Sanmin District, Kaohsiung 80708, Taiwan; Super Intendent Office, Kaohsiung Medical University Hospital, Taiwan.

Aim: To assess the effectiveness of a multimodal information technology-based hand hygiene strategy in improving knowledge, compliance, accuracy, and healthcare-associated infections density in Taiwan's nursing homes.

Background: Hand hygiene is the most effective and cost-efficient method for preventing healthcare-associated infections. However, compliance rates among healthcare workers in Taiwan remain low (3.


Immunological heterogeneity in Ménière's disease: CD4+ T cell subset profiling reveals three distinct Immunophenotypes.

J Neuroimmunol

September 2025

Department of Otorhinolaryngology Head and Neck Surgery, Zhongshan Hospital, Fudan University, Shanghai 200032, China; Department of Vertigo Diagnosis and Treatment Center, Zhongshan Hospital, Fudan University, Shanghai 200032, China.

Background: Ménière's disease (MD) remains a heterogeneous disorder with unclear pathogenesis. While immune dysregulation has been implicated, the specific role of CD4+ T cell subsets and their clinical correlations in MD are poorly understood.

Methods: We performed comprehensive immune profiling of 30 MD patients and 27 healthy controls using flow cytometry to analyze six CD4+ T cell subsets (Th1, Th2, Th17, Treg, TGF-β+, TNF-α+) and multiplex cytokine analysis of 16 inflammatory mediators plus IgE.
