Integrating single-cell and single-nucleus datasets improves bulk RNA-seq deconvolution.

Adriana Ivich, Casey S Greene

bioRxiv

Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.

Published: August 2025

Bulk RNA-seq deconvolution typically uses single-cell RNA-sequencing (scRNA-seq) references, but some cell types are only detectable through single-nucleus RNA sequencing (snRNA-seq). Because snRNA-seq captures nuclear, but not cytoplasmic, transcripts, direct use as a reference could reduce deconvolution accuracy. Here, we systematically benchmark strategies to integrate both modalities, focusing on transformations and gene-filtering approaches that harmonize snRNA-seq with scRNA-seq references. Across four diverse tissues, we evaluated principal component-based shifts, conditional and non-conditional variational autoencoders (scVI), and the removal of cross-modality differentially expressed genes (DEGs). While all methods improved performance relative to untransformed snRNA-seq, filtering consistent cross-modality DEGs delivered the greatest gains, often matching or surpassing scRNA-only references. Conditional scVI performed comparably and was especially effective when matched scRNA-snRNA cell types were unavailable. In real adipose bulk samples without ground truth, DEG pruning and conditional scVI provided the most robust cell-fraction estimates across donors and transformations. Together, these results demonstrate that scRNA-seq should be prioritized as the reference when available, with snRNA-seq appended only after filtering cross-modality DEGs. For less-characterized systems where DEG information is limited, conditional scVI offers a practical alternative. Our findings provide clear guidelines for modality-aware integration, enabling near-scRNA-seq accuracy in bulk deconvolution workflows.
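The DEG-pruning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes per-cell-type mean expression profiles are already available for both modalities, substitutes a simple log2 fold-change cutoff for a proper differential expression test, and treats a gene as "consistent" only if it crosses the cutoff in the same direction in every cell type shared by the two references. The function names, the threshold, the pseudocount, and the toy values are all hypothetical.

```python
import math


def cross_modality_degs(sc_means, sn_means, lfc_threshold=1.0, pseudocount=1.0):
    """Flag genes that differ between modalities in every shared cell type.

    sc_means / sn_means: {cell_type: {gene: mean_expression}}.
    A fold-change cutoff stands in for a real DE test (an assumption).
    """
    shared_types = sorted(set(sc_means) & set(sn_means))
    flagged = None
    for cell_type in shared_types:
        sc_profile, sn_profile = sc_means[cell_type], sn_means[cell_type]
        hits = set()
        for gene in set(sc_profile) & set(sn_profile):
            lfc = math.log2(
                (sn_profile[gene] + pseudocount) / (sc_profile[gene] + pseudocount)
            )
            if abs(lfc) >= lfc_threshold:
                # Keep the direction so only same-sign shifts count as consistent.
                hits.add((gene, lfc > 0))
        flagged = hits if flagged is None else flagged & hits
    return {gene for gene, _ in (flagged or set())}


def drop_genes(reference, genes_to_drop):
    """Return a copy of the reference without the flagged genes."""
    return {
        cell_type: {g: v for g, v in profile.items() if g not in genes_to_drop}
        for cell_type, profile in reference.items()
    }


# Toy, made-up profiles: one nuclear-enriched gene, one cytoplasm-enriched
# gene, and one stable marker per cell type.
sc_reference = {
    "adipocyte": {"gene_nuclear": 10.0, "gene_cyto": 200.0, "gene_stable_a": 50.0},
    "macrophage": {"gene_nuclear": 12.0, "gene_cyto": 180.0, "gene_stable_m": 40.0},
}
sn_reference = {
    "adipocyte": {"gene_nuclear": 300.0, "gene_cyto": 20.0, "gene_stable_a": 55.0},
    "macrophage": {"gene_nuclear": 280.0, "gene_cyto": 15.0, "gene_stable_m": 38.0},
}

degs = cross_modality_degs(sc_reference, sn_reference)
filtered_sn = drop_genes(sn_reference, degs)
filtered_sc = drop_genes(sc_reference, degs)
```

In this sketch the pruned snRNA-seq profiles would then be appended to the pruned scRNA-seq reference before running whichever deconvolution method is in use; the same gene set has to be dropped from the bulk matrix as well so that all inputs share one gene space.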

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12393506
DOI: http://dx.doi.org/10.1101/2025.08.20.671333

Similar Publications

Integrating single-cell RNA-seq datasets with substantial batch effects.

bioRxiv

February 2024

Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany.

Karin Hrovatin, Amir Ali Moinfar, Luke Zappia, Alejandro Tejada Lapuerta, Ben Lengerich

Integration of single-cell RNA-sequencing (scRNA-seq) datasets has become a standard part of the analysis, with conditional variational autoencoders (cVAE) being among the most popular approaches. Increasingly, researchers are asking to map cells across challenging cases such as cross-organs, species, or organoids and primary tissue, as well as different scRNA-seq protocols, including single-cell and single-nuclei. Current computational methods struggle to harmonize datasets with such substantial differences, driven by technical or biological variation.
