Citations

20

Article Abstract

Missing gene expression values are a common issue in RNAseq-based analyses of gene expression. However, the genetic and environmental factors contributing to data missingness in RNAseq-based assessment of gene expression have never been analyzed. In this study, we sought to identify factors underlying RNAseq data missingness. We used RNAseq data from 66 lung adenocarcinoma tumors and corresponding adjacent normal lung tissues. We found a strong negative association between gene expression level and missingness, supporting the idea that borderline expression is a key contributor to missingness. A more detailed analysis showed that the relationship between gene expression and missingness is more complex: while the expected negative association between missingness and expression level was observed for genes with low missingness, mean expression spiked at the right end of the distribution, which included genes with very high missingness. We hypothesized that genes with a high missing rate include not only genes with borderline expression but also genes with high expression in some individuals and no expression in others (true biological missingness, TBM). A comparative analysis of missingness in smokers and nonsmokers, an examination of the proportion of known tobacco smoke-sensitive genes by missing rate, and gene enrichment analysis all support this hypothesis. We argue that it is beneficial to first check the data for genes with TBM. The presence of highly expressed genes with missingness indicates TBM related to inter-individual variation in gene expression level. Our results call for caution against indiscriminate imputation of missing values. When TBM is present, it is advisable to identify the affected genes and analyze them separately, because including them in imputation introduces bias: expression values will be assigned to a subset of genes that are not expressed.
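The screening step recommended in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not specify. It assumes a genes-by-samples expression matrix in which zeros and NaN count as missing, and it flags genes that combine a high missing rate with high mean expression among the samples where they are observed. The function name `flag_tbm_candidates` and both thresholds (`min_missing_rate`, `expr_quantile`) are hypothetical choices made for illustration.

```python
import numpy as np


def flag_tbm_candidates(expr, min_missing_rate=0.5, expr_quantile=0.75):
    """Flag genes that look like true-biological-missingness candidates.

    expr: genes x samples array. ASSUMPTION: a value of 0 or NaN means
    the gene was not detected in that sample.
    Thresholds are illustrative defaults, not values from the paper.
    """
    expr = np.asarray(expr, dtype=float)

    # Per-gene fraction of samples with no detected expression.
    missing = np.isnan(expr) | (expr == 0)
    missing_rate = missing.mean(axis=1)

    # Mean expression over observed samples only; genes that are
    # missing in every sample keep NaN and are never flagged.
    observed = np.where(missing, np.nan, expr)
    n_obs = (~missing).sum(axis=1)
    sums = np.nansum(observed, axis=1)
    mean_expr = np.divide(
        sums, n_obs, out=np.full(len(n_obs), np.nan), where=n_obs > 0
    )

    # "High expression" is defined relative to the other genes.
    cutoff = np.nanquantile(mean_expr, expr_quantile)

    # Candidate: frequently missing, yet highly expressed when present.
    candidates = (missing_rate >= min_missing_rate) & (mean_expr >= cutoff)
    return missing_rate, mean_expr, candidates
```

Genes flagged this way could be set aside before imputation and analyzed separately, while genes with a high missing rate but low observed expression remain consistent with borderline expression. Suitable cutoffs depend on the dataset and would need to be chosen by inspecting the distribution of mean expression across missing-rate bins.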

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12373826
DOI: http://dx.doi.org/10.1038/s41598-025-14395-0

Publication Analysis

Top Keywords

gene expression: 28
expression level: 16
true biological: 16
biological missingness: 16
missingness: 15
expression: 14
genes high: 12
genes: 11
missingness rnaseq-based: 8
gene: 8

Similar Publications

Background: C-C motif chemokine ligand 3 (CCL3) is a crucial chemokine that plays a fundamental role in the immune microenvironment and is closely linked to the development of various cancers. Despite its importance, there is limited research regarding the expression and function of CCL3 in nasopharyngeal carcinoma (NPC). Therefore, this study seeks to examine the expression of CCL3 and assess its clinical significance in NPC using bioinformatics analysis and experiments.

Background: Prostate cancer is one of the most common malignancies in males worldwide. Serum prostate-specific antigen is a frequently employed biomarker in the diagnosis and risk stratification of prostate cancer; however, it is known for its low predictive accuracy for disease progression. New prognostic biomarkers are needed to distinguish aggressive prostate cancer from low-risk disease.

Central nervous system tumors with BCL6 corepressor (BCOR) internal tandem duplications (ITDs) constitute a rare, recently characterized pediatric neoplasm with distinct molecular and histopathological features. To date, 69 cases have been documented in the literature, including our institutional case. These neoplasms predominantly occur in young children, with the cerebellum representing the most frequent anatomical location.

Macrophage Migration Inhibitory Factor (MIF) is a pleiotropic cytokine that acts as a central regulator of inflammation and immune responses across diverse organ systems. Functioning upstream in immune activation cascades, MIF influences macrophage polarization, T and B cell differentiation, and cytokine expression through CD74, CXCR2/4/7, and downstream signaling via NF-κB, ERK1/2, and PI3K/AKT pathways. This review provides a comprehensive analysis of MIF's mechanistic functions under both physiological and pathological conditions, highlighting its dual role as a protective mediator during acute stress and as a pro-inflammatory amplifier in chronic disease.

Background/aims: Ubiquitin D (UBD), a member of the ubiquitin-like modifier (UBL) family, is significantly overexpressed in various cancers and is positively correlated with tumor progression. However, the role and underlying mechanisms of UBD in rheumatoid arthritis (RA) remain poorly understood. This study aimed to investigate the effects of UBD knockdown on the progression of RA.
