Citations

20

Article Abstract

Background: Systems-level analyses, such as differential gene expression analysis, co-expression analysis, and metabolic pathway reconstruction, depend on the accuracy of the transcriptome. Multiple tools exist to perform transcriptome assembly from RNAseq data. However, assembling high-quality transcriptomes is still not a trivial problem. This is especially the case for non-model organisms, where adequate reference genomes are often not available. Different methods produce different transcriptome models, and there is no easy way to determine which are more accurate. Furthermore, alternative-splicing events exacerbate these assembly problems. While benchmarking transcriptome assemblies is critical, it is also not trivial due to the general lack of true reference transcriptomes.

Results: In this study, we first provide a pipeline to generate a simulated benchmark transcriptome and corresponding RNAseq data. Using the simulated benchmarking datasets, we compared the performance of various transcriptome assembly approaches, including both de novo and genome-guided methods. The results showed that assembly performance deteriorates significantly when alternative transcripts (isoforms) exist or, for genome-guided methods, when the reference is not available from the same genome. To improve transcriptome assembly performance by leveraging the overlapping predictions between different assemblies, we present a new consensus-based ensemble transcriptome assembly approach, ConSemble.

Conclusions: Without using a reference genome, ConSemble using four de novo assemblers achieved an accuracy up to twice as high as that of any individual de novo assembler we compared. When a reference genome is available, ConSemble using four genome-guided assemblies removed many incorrectly assembled contigs with minimal impact on correctly assembled contigs, achieving higher precision and accuracy than individual genome-guided methods. Furthermore, ConSemble using de novo assemblers matched or exceeded the best-performing genome-guided assemblers even when the transcriptomes included isoforms. We thus demonstrated that the ConSemble consensus strategy can improve transcriptome assembly for both de novo and genome-guided assemblers. The RNAseq simulation pipeline, the benchmark transcriptome datasets, and the script to perform the ConSemble assembly are all freely available from http://bioinfolab.unl.edu/emlab/consemble/.
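
The consensus idea described in the abstract (keeping predictions that overlap between different assemblies) can be illustrated with a minimal sketch. This is not the authors' script, which is available at the URL above. The function names `consensus_contigs` and `precision_recall`, the `min_support` threshold of 3, the exact-string matching of sequences, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def consensus_contigs(assemblies: Dict[str, Iterable[str]],
                      min_support: int = 3) -> Set[str]:
    """Return sequences reported by at least `min_support` assemblers.

    Assumption: two contigs "overlap" only if their sequences are identical.
    """
    support: Counter = Counter()
    for contigs in assemblies.values():
        # Count each assembler at most once per sequence.
        support.update(set(contigs))
    return {seq for seq, n in support.items() if n >= min_support}


def precision_recall(predicted: Set[str],
                     reference: Set[str]) -> Tuple[float, float]:
    """Score a predicted contig set against a benchmark transcriptome."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy data (hypothetical): four assemblers, three true transcripts.
assemblies = {
    "assembler_a": ["ATGAAA", "ATGCCC", "ATGTTT"],
    "assembler_b": ["ATGAAA", "ATGCCC"],
    "assembler_c": ["ATGAAA", "ATGCCC", "ATGGGG"],
    "assembler_d": ["ATGAAA", "ATGTTT"],
}
benchmark = {"ATGAAA", "ATGCCC", "ATGTTT"}

consensus = consensus_contigs(assemblies, min_support=3)
precision, recall = precision_recall(consensus, benchmark)
```

In this toy case the sequence reported by only one assembler is discarded, which raises precision, while a true transcript recovered by only two assemblers is also lost, which lowers recall. The choice of `min_support` controls that trade-off.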

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8532302
DOI: http://dx.doi.org/10.1186/s12859-021-04434-8

Publication Analysis

Top Keywords

transcriptome assembly: 24
improve transcriptome: 12
genome-guided methods: 12
reference genome: 12
novo assemblers: 12
transcriptome: 11
assembly: 9
consensus-based ensemble: 8
assembly rnaseq: 8
rnaseq data: 8

Similar Publications

CETN3 deficiency induces microcephaly by disrupting neural stem/progenitor cell fate through impaired centrosome assembly and RNA splicing.

EMBO Mol Med

September 2025

Institute for Regenerative Medicine, Medical Innovation Center and State Key Laboratory of Cardiovascular Diseases, Shanghai East Hospital, National Stem Cell Translational Resource Center & Ministry of Education Stem Cell Resource Center, Frontier Science Center for Stem Cell Research, School of Li

Primary microcephaly, a rare congenital condition characterized by reduced brain size, occurs due to impaired neurogenesis during brain development. Through whole-exome sequencing, we identified compound heterozygous loss-of-function mutations in CENTRIN 3 (CETN3) in a 5-year-old patient with primary microcephaly. As CETN3 has not been previously linked to microcephaly, we investigated its potential function in neurodevelopment in human pluripotent stem cell-derived cerebral organoids.

There is limited understanding of the impact of anti-IL5 treatment on nasal polyp tissue biology in chronic rhinosinusitis with nasal polyps (CRSwNP). This study examined nasal polyp tissue cellular proteome and transcriptome responses to anti-IL5 treatment in CRSwNP utilising spatial profiling. GeoMx™ Digital Spatial Profiling (DSP) of 80 proteins and 1,833 mRNA targets in the polyp stroma and the whole transcriptome (18,815 mRNA targets) in polyp epithelia was undertaken on sinonasal biopsies collected from 20 individuals with eosinophilic CRSwNP before and after 16 and 24 weeks of mepolizumab treatment.

Genome Assembly and Molecular Analysis Uncover the Salt Tolerance Mechanism of Coastal Plant Scaevola sericea.

Plant Cell Environ

October 2025

Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China.

Silicosis is a fatal occupational lung disease characterized by persistent inflammation and irreversible fibrosis. However, the pathogenesis of silicosis is currently unclear. In this study, a mouse model of silicosis was established by intranasal instillation of silica, and transcriptomic alterations in lung tissues were assessed by mRNA-sequencing.

Mutualistic endosymbiosis is a cornerstone of evolutionary innovation, enabling organisms to exploit diverse niches unavailable to individual species. However, our knowledge about the early evolutionary stage of this relationship remains limited. The association between the ciliate Tetrahymena utriculariae and its algal endosymbiont Micractinium tetrahymenae indicates an incipient stage of photoendosymbiosis.
