Accurately assembling nanopore sequencing data of highly pathogenic bacteria.

BMC Genomics

Institute of Bacterial Infections and Zoonoses, Federal Research Institute for Animal Health, Friedrich-Loeffler-Institute, Naumburger Str. 96a, 07743, Jena, Germany.

Published: August 2025


Article Abstract

Background: Bacterial genome exploration and outbreak analysis rely heavily on robust whole-genome sequencing and bioinformatics analysis. Widely used genomic methods, such as genotyping and detection of genetic markers, demand high sequencing accuracy and precise genome assembly for reliable results.

Methods: To assess the utility of nanopore sequencing for genotyping highly pathogenic bacteria with low mutation rates, we sequenced six reference strains using both Oxford Nanopore Technologies (ONT) R10.4.1 chemistry and Illumina sequencing, and evaluated different assembly strategies. The publicly available RefSeq assemblies were chosen as the ground truth. Publicly available sequencing data from key foodborne and public-health-related bacterial pathogens were also examined to provide a broader context for the analysis.

Results: An almost perfect assembly was achieved for Bacillus (Ba.) anthracis, but results varied for other species. For Brucella (Br.) spp., the final assemblies differed from the Sanger-sequenced references by 5 to 46 nucleotides. For some key foodborne and public-health-related bacterial pathogens (Klebsiella (K.) variicola, Listeria spp., Mycobacterium (M.) tuberculosis, Staphylococcus (Sta.) aureus, and Streptococcus (Str.) pyogenes), perfect genomes were obtained. Enhanced basecalling models generally improved assembly accuracy; however, for certain species such as Br. abortus, older models produced higher accuracy. Long-read polishing generally improved assembly quality, with only one round needed, but our results indicate that it may also degrade assembly quality. Overall, 81% of the observed errors in ONT assemblies were located within coding sequences (CDS). Furthermore, we found that methylation caused 6.5% of the errors, and the bacterial methylation-aware medaka polishing model reduced the number of errors linked to methylation. Core-genome multilocus sequence typing (cgMLST) analysis revealed allele differences in Ba. anthracis, Br. abortus, and Francisella (F.) tularensis for some assemblers, although fewer than five in each case. In the case of Br. melitensis, some assemblies included five allele differences, whereas for Br. suis the correct cgMLST alleles were observed.
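The cgMLST comparison reported above comes down to counting loci at which two allele profiles disagree. The following is a minimal Python sketch of that count, not the authors' pipeline: the function name `allele_differences`, the locus names, and the allele numbers are hypothetical, and the choice to skip loci that are missing in either profile is an assumption (typing tools differ in how they treat missing loci).

```python
from typing import Dict, Optional

Profile = Dict[str, Optional[int]]


def allele_differences(profile_a: Profile, profile_b: Profile) -> int:
    """Count shared loci whose allele numbers differ.

    Assumption: a locus that is absent or uncalled (None) in either
    profile is skipped rather than counted as a difference.
    """
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


# Hypothetical toy profiles: locus name -> allele number.
reference = {"locus_1": 1, "locus_2": 4, "locus_3": 2, "locus_4": 7}
ont_assembly = {"locus_1": 1, "locus_2": 5, "locus_3": None, "locus_4": 7}

print(allele_differences(reference, ont_assembly))  # 1
```

In this toy case only locus_2 differs, and locus_3 is ignored because it was not called in the ONT assembly. Even a handful of such differences can matter when outbreak clusters are defined by small allele-distance thresholds.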

Conclusions: Assemblies of nanopore data from pathogenic bacteria vary in quality across species and methods. Errors persist in the final assemblies, including within cgMLST loci, which can affect the reliability of outbreak predictions. Nevertheless, specific combinations of existing tools can generate perfect genome assemblies from bacterial ONT sequencing data for outbreak analysis without short-read polishing.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12392509
DOI: http://dx.doi.org/10.1186/s12864-025-11793-6
