Bioinform Adv
July 2025
Summary: The rise of large-scale single-cell RNA-seq data has introduced challenges in data processing due to its slow speed. Leveraging advancements in Graphics Processing Unit (GPU) computing ecosystems, such as and Compute Unified Device Architecture (CUDA), building on and package, we developed , a GPU-accelerated solution for large-scale single-cell data processing. delivers over a 20× speedup through GPU computing and significantly improves scalability, handling datasets of 10-20 million cells with over 1000 batches by overcoming the memory bottleneck on a single A100 card, which far surpasses s capacity of processing only 1 million cells without multi-GPU support.
Traditional gene expression deconvolution methods assess a limited number of cell types and therefore do not capture the full complexity of the tumor microenvironment (TME). Here, we integrate nine deconvolution tools to assess 79 TME cell types in 10,592 tumors across 33 different cancer types, creating the most comprehensive analysis of the TME. In total, we found 41 patterns of immune infiltration and stroma profiles, identifying heterogeneous yet unique TME portraits for each cancer and several new findings.
Acta Neuropathol Commun
May 2025
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disorder characterized by progressive motor neuron loss and muscle atrophy. Hyperphosphorylated aggregates of the RNA-binding protein TDP-43 in the motor cortex and spinal cord are defining molecular features of ALS, suggesting TDP-43 dysfunction underlies disease pathogenesis. This phenomenon, however, has been difficult to recapitulate endogenously in animal models, impeding characterization of TDP-43 pathobiology in neurodegeneration.
Dorsal root ganglion (DRG) toxicity has been consistently reported as a potential safety concern after delivery of adeno-associated viruses (AAVs) containing gene-replacement vectors but has yet to be reported for RNAi-based vectors. Here, we report DRG toxicity after AAV intra-CSF delivery of an RNAi expression construct (an artificial microRNA targeting superoxide dismutase 1, SOD1) in non-human primates (NHPs) and provide evidence that this can be recapitulated in mice. Histopathology evaluation showed that NHPs and mice develop DRG toxicity after AAV delivery, including DRG neuron degeneration and necrosis and nerve-fiber degeneration that were associated with increases in cerebrospinal fluid (CSF) and serum phosphorylated neurofilament heavy chain (pNF-H).
Intra-tumor heterogeneity is characterized by a diverse population of tumor clones and subclones which are important drivers of tumor evolution and therapeutic response. However, accurate subclonal reconstruction at scale remains challenging. We developed a machine learning tool, CliPP, and surveyed 9,972 tumors from 32 cancer types.
Front Genet
October 2023
Pancreatic ductal adenocarcinoma (PDAC) is a lethal disease characterized by a diverse tumor microenvironment. The heterogeneous cellular composition of PDAC makes it challenging to study molecular features of tumor cells using extracts from bulk tumor. The metabolic features in tumor cells from clinical samples are poorly understood, and their impact on clinical outcomes is unknown.
Emerging evidence suggests that cryptic translation beyond the annotated translatome produces proteins with developmental or physiological functions. However, functions of cryptic non-canonical open reading frames (ORFs) in cancer remain largely unknown. To fill this gap and systematically identify colorectal cancer (CRC) dependency on non-canonical ORFs, we apply an integrative multiomic strategy, combining ribosome profiling and a CRISPR-Cas9 knockout screen with large-scale analysis of molecular and clinical data.
J Immunother Cancer
August 2023
Background: , the most mutated gene in solid cancers, has a profound impact on most hallmarks of cancer. Somatic mutations occur in high frequencies in head and neck cancers, including oral squamous cell carcinoma (OSCC). Our study aims to understand the role of gain-of-function mutation in modulating the tumor immune microenvironment (TIME) in OSCC.
BMC Genomics
May 2023
Background: Single-cell RNA sequencing is a state-of-the-art technology to understand gene expression in complex tissues. With the growing amount of data being generated, the standardization and automation of data analysis are critical to generating hypotheses and discovering biological insights.
Results: Here, we present scRNASequest, a semi-automated single-cell RNA-seq (scRNA-seq) data analysis workflow which allows (1) preprocessing from raw UMI count data, (2) harmonization by one or multiple methods, (3) reference-dataset-based cell type label transfer and embedding projection, (4) multi-sample, multi-condition single-cell level differential gene expression analysis, and (5) seamless integration with cellxgene VIP for visualization and with CellDepot for data hosting and sharing by generating compatible h5ad files.
Whole-organ mapping was used to study molecular changes in the evolution of bladder cancer from field effects. We identified more than 100 dysregulated pathways, involving immunity, differentiation, and transformation, as initiators of carcinogenesis. Dysregulation of interleukins signified the involvement of inflammation in the incipient phases of the process.
Nat Biotechnol
November 2022
Single-cell RNA sequencing studies have suggested that total mRNA content correlates with tumor phenotypes. Technical and analytical challenges, however, have so far impeded at-scale pan-cancer examination of total mRNA content. Here we present a method to quantify tumor-specific total mRNA expression (TmS) from bulk sequencing data, taking into account tumor transcript proportion, purity and ploidy, which are estimated through transcriptomic/genomic deconvolution.
Our previous study showed that the upregulation of peroxisome proliferator-activated receptor gamma (PPARG) could promote chemosensitivity of hypopharyngeal squamous cell carcinoma (HSCC) in chemotherapeutic treatments. Here, we acquired two additional independent PPARG expression datasets to validate the expression levels of PPARG in chemotherapy-sensitive patients (CSP) and its individualized variations compared to chemotherapy-non-sensitive patients (CNSP). Our results showed that overall PPARG expression was mildly downregulated (log fold change = -0.
Limited clinical activity has been seen in osteosarcoma (OS) patients treated with immune checkpoint inhibitors (ICI). To gain insights into the immunogenic potential of these tumors, we conducted whole genome, RNA, and T-cell receptor sequencing, immunohistochemistry and reverse phase protein array profiling (RPPA) on OS specimens from 48 pediatric and adult patients with primary, relapsed, and metastatic OS. Median immune infiltrate level was lower than in other tumor types where ICI are effective, with concomitant low T-cell receptor clonalities.
Differential network analysis investigates how the network of connected genes changes from one condition to another and has become a prevalent tool to provide a deeper and more comprehensive understanding of the molecular etiology of complex diseases. Based on the asymptotically normal estimation of large Gaussian graphical model (GGM) in the high-dimensional setting, we developed a computationally efficient test for differential network analysis through testing the equality of two precision matrices, which summarize the conditional dependence network structures of the genes. Additionally, we applied a multiple testing procedure to infer the differential network structure with false discovery rate (FDR) control.
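The abstract above does not say which multiple testing procedure was used. As an illustration only, the sketch below shows the Benjamini-Hochberg step-up procedure, one widely used way to control the FDR across many per-edge tests. The function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the paper.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Illustrative Benjamini-Hochberg step-up procedure; not necessarily
    the procedure used in the paper summarized above.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_k = rank
    # Reject every hypothesis whose rank is at most largest_k.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, e.g. one per candidate differential edge.
    print(benjamini_hochberg([0.039, 0.001, 0.205, 0.008], q=0.05))
```

Because the procedure is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value further down the sorted list meets its threshold.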
iScience
November 2018
Transcriptome deconvolution in cancer and other heterogeneous tissues remains challenging. Available methods lack the ability to estimate both component-specific proportions and expression profiles for individual samples. We present DeMixT, a new tool to deconvolve high-dimensional data from mixtures of more than two components.
IEEE/ACM Trans Comput Biol Bioinform
May 2019
The method of Sorted L-One Penalized Estimation, or SLOPE, is a sparse regression method recently introduced by Bogdan et al. [1].
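For readers unfamiliar with the penalty behind the name, SLOPE replaces the single lasso weight with a sequence of nonincreasing weights applied to the coefficient magnitudes sorted from largest to smallest. The sketch below computes that sorted L1 norm in plain Python. The function name `sorted_l1_norm` and the example numbers are illustrative assumptions and do not come from the paper.

```python
def sorted_l1_norm(beta, lam):
    """Sorted L1 norm: sum_i lam[i] * |beta|_(i), with |beta|_(1) >= |beta|_(2) >= ...

    `lam` must be nonnegative and nonincreasing; with all weights equal
    the value reduces to the ordinary lasso penalty.
    """
    if len(beta) != len(lam):
        raise ValueError("beta and lam must have the same length")
    if any(w < 0 for w in lam):
        raise ValueError("weights must be nonnegative")
    if any(a < b for a, b in zip(lam, lam[1:])):
        raise ValueError("weights must be nonincreasing")
    # Pair the largest magnitude with the largest weight, and so on.
    magnitudes = sorted((abs(b) for b in beta), reverse=True)
    return sum(w * m for w, m in zip(lam, magnitudes))


if __name__ == "__main__":
    # Hypothetical coefficients and weights, for illustration only.
    print(sorted_l1_norm([1.0, -3.0, 2.0], [3.0, 2.0, 1.0]))
```

The value depends only on the magnitudes of the coefficients, not on their order or sign, so permuting or negating entries of `beta` leaves it unchanged.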
IEEE Trans Biomed Eng
February 2018
Finding correlations across multiple data sets in imaging and (epi)genomics is a common challenge. Sparse multiple canonical correlation analysis (SMCCA) is a multivariate model widely used to extract contributing features from each data set while maximizing the cross-modality correlation. The model is built from the combination of pairwise covariances between any two data sets.
To explore novel molecular mechanisms underlying obesity, we applied a systems genetics framework to integrate risk genetic loci from the largest body mass index (BMI) genome-wide association studies (GWAS) meta-analysis with mRNA and microRNA profiling in adipose tissue from 200 subjects. One module was identified to be most significantly associated with obesity and other metabolic traits. We identified eight hub genes which likely play important roles in obesity metabolism and identified microRNAs that significantly negatively correlated with hub genes.
IEEE/ACM Trans Comput Biol Bioinform
May 2018
In this study, in order to take advantage of complementary information from different types of data for better disease status diagnosis, we combined gene expression with DNA methylation data and generated a fused network, based on which the stages of Kidney Renal Cell Carcinoma (KIRC) can be better identified. It is well recognized that a network is important for investigating the connectivity of disease groups. We exploited the potential of the network's features to identify the KIRC stage.
Background: Existing microarray studies of bone mineral density (BMD) have been critical for understanding the pathophysiology of osteoporosis, and have identified a number of candidate genes. However, these studies were limited by their relatively small sample sizes and were usually analyzed individually. Here, we propose a novel network-based meta-analysis approach that combines data across six microarray studies to identify functional modules from human protein-protein interaction (PPI) data, and highlight several differentially expressed genes (DEGs) and a functional module that may play an important role in BMD regulation in women.
Bioinformatics
February 2016
Motivation: In searching for genetic variants for complex diseases with deep sequencing data, genomic marker sets of high-dimensional genotypic data and sparse functional variants are quite common. Existing sequence association tests are incapable of identifying such marker sets or individual causal loci, although they appeared powerful to identify small marker sets with dense functional variants. In sequence association studies of admixed individuals, cryptic relatedness and population structure are known to confound the association analyses.
Genet Epidemiol
December 2014
Joint adjustment of cryptic relatedness and population structure is necessary to reduce bias in DNA sequence analysis; however, existing sparse regression methods model these two confounders separately. Incorporating prior biological information has great potential to enhance statistical power, but such information is often overlooked in existing sparse regression models. We developed a unified sparse regression (USR) to incorporate prior information and jointly adjust for cryptic relatedness, population structure, and other environmental covariates.