Numt identification and removal with RtN!

Bioinformatics

Department of Microbiology, Immunology and Genetics.

Published: December 2020



Article Abstract

Motivation: Assays in mitochondrial genomics rely on accurate read mapping and variant calling. However, both known and unknown nuclear paralogs exist whose genetic properties differ fundamentally from those of the mitochondrial genome. Such paralogs complicate the interpretation of mitochondrial genome data and confound variant calling.

Results: Remove the Numts! (RtN!) was developed to categorize reads from massively parallel sequencing data based not on the expected properties and sequence identities of paralogous nuclear-encoded mitochondrial sequences, but on sequence similarity to a large database of publicly available mitochondrial genomes. RtN! removes low-level sequencing noise and mitochondrial paralogs without affecting variant calling, whereas competing methods were shown to remove true variants from mitochondrial mixtures.
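
The idea of classifying reads by their similarity to a panel of known mitochondrial genomes, rather than by similarity to known nuclear paralogs, can be illustrated with a toy sketch. This is not RtN!'s implementation: exact k-mer set matching is a stand-in technique chosen for brevity, and the function names, the `min_fraction` threshold, the labels, and the example sequences are all hypothetical. The sketch also ignores the circularity of the mitochondrial genome and reverse-complement reads.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(genomes, k):
    """Collect every k-mer seen in a panel of known mitochondrial genomes."""
    index = set()
    for genome in genomes:
        index |= kmers(genome, k)
    return index


def classify_read(read, index, k, min_fraction=1.0):
    """Label a read by the fraction of its k-mers found in the panel.

    min_fraction is an assumed, illustrative threshold (1.0 means every
    k-mer of the read must occur in at least one panel genome).
    """
    read_kmers = kmers(read, k)
    if not read_kmers:
        return "unclassified"  # read is shorter than k
    shared = len(read_kmers & index) / len(read_kmers)
    return "mito-like" if shared >= min_fraction else "candidate-numt-or-noise"


# Hypothetical toy sequences, not real mitochondrial data.
panel = ["ACGTACGTTAGC", "ACGTACGATAGC"]
index = build_index(panel, k=4)

print(classify_read("ACGTACGTTA", index, k=4))  # all k-mers occur in the panel
print(classify_read("ACGTTTTTTA", index, k=4))  # contains k-mers absent from the panel
```

Under this scheme a read that carries a genuine variant already represented somewhere in the panel is retained, while a read whose sequence matches no known mitochondrial genome is flagged. That is the property the abstract contrasts with methods that filter on similarity to nuclear paralogs.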

Availability And Implementation: https://github.com/Ahhgust/RtN.

Supplementary Information: Supplementary data are available at Bioinformatics online.


Source
http://dx.doi.org/10.1093/bioinformatics/btaa642

