Citations

20

Article Abstract

Most existing dimensionality reduction and clustering packages for single-cell RNA-seq (scRNA-seq) data deal with dropouts through heavy modeling and computational machinery. Here, we introduce CIDR (Clustering through Imputation and Dimensionality Reduction), an ultrafast algorithm that uses a novel yet very simple implicit imputation approach to alleviate the impact of dropouts in scRNA-seq data in a principled manner. Using a range of simulated and real data, we show that CIDR improves on standard principal component analysis and outperforms the state-of-the-art methods, namely t-SNE, ZIFA, and RaceID, in terms of clustering accuracy. CIDR typically completes within seconds when processing a data set of hundreds of cells and within minutes for a data set of thousands of cells. CIDR can be downloaded at https://github.com/VCCRI/CIDR.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5371246
DOI: http://dx.doi.org/10.1186/s13059-017-1188-0

Publication Analysis

Top Keywords

clustering imputation: 8
single-cell rna-seq: 8
dimensionality reduction: 8
scrna-seq data: 8
data set: 8
data: 6
cidr: 5
cidr ultrafast: 4
ultrafast accurate: 4
clustering: 4

Similar Publications

In engineering structure performance monitoring, capturing real-time on-site data and conducting precise analysis are critical for assessing structural condition and safety. However, equipment instability and complex on-site environments often lead to data anomalies and gaps, hindering accurate performance evaluation. This study, conducted within a wind farm reinforcement project in Shandong Province, addresses these challenges by focusing on anomaly detection and data imputation for weld nail strain, anchor cable axial force, and concrete strain.

Background: Red-cell alloimmunisation is a preventable driver of haemolytic disease of the fetus and newborn, yet most risk scores rely on single-parameter thresholds and overlook clinically important heterogeneity.

Objective: To uncover latent phenotypes among sensitised pregnancies by clustering routinely collected clinical and immunohaematologic variables.

Methods: We retrospectively analysed 2084 antenatal records (2020-2021).

Single-cell RNA sequencing (scRNA-seq) has revolutionized molecular biology and genomics by enabling the profiling of individual cell types, providing insights into cellular heterogeneity. Deep learning methods have become popular in single-cell analysis for tasks such as dimension reduction, cell clustering, and data imputation. In this work, we introduce DropDAE, a denoising autoencoder (DAE) model enhanced with contrastive learning, to specifically address the dropout events in scRNA-seq data, where certain genes show very low or even zero expression levels due to technical limitations.

Comparing Multiple Imputation Methods to Address Missing Patient Demographics in Immunization Information Systems: Retrospective Cohort Study.

JMIR Public Health Surveill

August 2025

Scientific Services - Analytics, Scientific Technologies Corporation (United States), 411 S 1st St, Phoenix, AZ, 85004, United States, 1 480-745-8500.

Background: Immunization Information Systems (IIS) and surveillance data are essential for public health interventions and programming; however, missing data are often a challenge, potentially introducing bias and impacting the accuracy of vaccine coverage assessments, particularly in addressing disparities.

Objective: This study aimed to evaluate the performance of 3 multiple imputation methods, Stata's (StataCorp LLC) multiple imputation using chained equations (MICE), scikit-learn's IterativeImputer, and Python's miceforest package, in managing missing race and ethnicity data in large-scale surveillance datasets. We compared these methods on their ability to preserve demographic distributions and on their computational efficiency, and we performed G-tests on contingency tables to obtain likelihood ratio statistics assessing the association between race and ethnicity and flu vaccination status.
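
For readers unfamiliar with the G-test mentioned above, the following is a minimal sketch of how a likelihood ratio statistic can be computed from a contingency table, using the standard formula G = 2 Σ O ln(O/E). It is not the study's code: the function name `g_test` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def g_test(table):
    """Return (G, degrees_of_freedom) for a 2-D contingency table of counts.

    G = 2 * sum(O * ln(O / E)), where E is the count expected under
    independence of the row and column variables.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if observed > 0:  # a zero cell contributes nothing (0 * ln 0 := 0)
                g += observed * math.log(observed / expected)

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return 2.0 * g, dof


# Hypothetical counts (not from the study):
# rows = two demographic groups, columns = vaccinated / not vaccinated.
example = [[30, 20], [20, 30]]
g_stat, dof = g_test(example)
print(f"G = {g_stat:.4f}, dof = {dof}")
```

Under the null hypothesis of no association, G is approximately chi-square distributed with the returned degrees of freedom, so a larger G indicates stronger evidence of association between the two variables.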

Development and Validation of an Artificial Intelligence-Driven Model for Accurate Classification of Erythrodermic Psoriasis Severity: Erythrodermic Psoriasis Integrated Classification System (EPICS).

Am J Clin Dermatol

August 2025

Department of Dermatology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases, Beijing, China. jinhongzhong@2

Background: Erythrodermic psoriasis is a rare subtype of psoriasis with widespread skin lesions, with some patients experiencing severe systemic symptoms.

Objective: We aimed to develop and validate an artificial intelligence-driven model for accurate classification of erythrodermic psoriasis severity by integrating clinical and laboratory indicators.

Methods: A retrospective cohort study was conducted at Peking Union Medical College Hospital (2005-22).
