The size of microbial sequence databases continues to grow beyond the abilities of existing alignment tools. We introduce LexicMap, a nucleotide sequence alignment tool for efficiently querying moderate-length sequences (>250 bp) such as a gene, plasmid or long read against up to millions of prokaryotic genomes. We construct a small set of probe k-mers, which are selected to efficiently sample the entire database to be indexed such that every 250-bp window of each database genome contains multiple seed k-mers, each with a shared prefix with one of the probes. Storing these seeds in a hierarchical index enables fast and low-memory alignment. We benchmark both accuracy and potential to scale to databases of millions of bacterial genomes, showing that LexicMap achieves comparable accuracy to state-of-the-art methods but with greater speed and lower memory use. Our method supports querying at scale and within minutes, which will be useful for many biological applications across epidemiology, ecology and evolution.
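The seeding idea in the abstract can be illustrated with a toy sketch. This is not LexicMap's implementation: it assumes, purely for illustration, that each probe selects the genome k-mer sharing the longest prefix with it, and it checks only whether every window contains at least one seed start (the abstract says multiple seeds per 250-bp window). The function names `pick_seeds` and `uncovered_windows` and the brute-force scan are hypothetical.

```python
from bisect import bisect_left


def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_seeds(genome: str, probes: list[str], k: int) -> dict[str, tuple[int, int]]:
    """For each probe, find the genome k-mer with the longest shared prefix.

    Returns probe -> (k-mer start position, shared prefix length).
    Brute force for clarity; ties keep the leftmost position.
    """
    seeds = {}
    for probe in probes:
        best_pos, best_len = -1, -1
        for i in range(len(genome) - k + 1):
            plen = common_prefix_len(probe, genome[i:i + k])
            if plen > best_len:
                best_pos, best_len = i, plen
        seeds[probe] = (best_pos, best_len)
    return seeds


def uncovered_windows(seed_positions: list[int], genome_len: int, window: int) -> list[int]:
    """Start positions of windows that contain no seed start (simplified coverage check)."""
    positions = sorted(set(seed_positions))
    gaps = []
    for start in range(genome_len - window + 1):
        i = bisect_left(positions, start)  # first seed at or after the window start
        if i == len(positions) or positions[i] >= start + window:
            gaps.append(start)
    return gaps


if __name__ == "__main__":
    genome = "ACGTACGTTTGA"
    seeds = pick_seeds(genome, ["ACGA", "TTGC"], k=4)
    print(seeds)
    print(uncovered_windows([pos for pos, _ in seeds.values()], len(genome), window=6))
```

In this toy setting, any window reported by `uncovered_windows` would need an additional seed before the coverage property described in the abstract could hold.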
DOI: http://dx.doi.org/10.1038/s41587-025-02812-8
Nat Biotechnol
September 2025
European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK.
mSystems
September 2025
Center for Infection Biology, School of Basic Medical Sciences, Tsinghua University, Beijing, China.
Human-associated metagenomic data often contain human nucleic acid information, which can affect the accuracy of microbial classification or raise ethical concerns. These reads are typically removed through alignment to the human genome using various metagenomic mapping tools or human reference genomes, followed by filtration before metagenomic analysis. In this study, we conducted a comprehensive analysis to identify the optimal combination of alignment software and human reference genomes using benchmarking data.
J Appl Clin Med Phys
September 2025
Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia, USA.
Purpose: Real‑time magnetic resonance-guided radiation therapy (MRgRT) integrates MRI with a linear accelerator (Linac) for gating and adaptive radiotherapy, which requires robust image‑quality assurance over a large field of view (FOV). Specialized phantoms capable of accommodating this extensive FOV are therefore essential. This study compares the performance of four commercial MRI phantoms on a 0.
BMC Infect Dis
September 2025
Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden.
Background: Escherichia coli ST131 and clade H30Rx are the most prevalent extended-spectrum β-lactamase-producing E. coli (ESBL-EC) causing bacteremia and urinary tract infections globally and in Sweden. Previous studies have linked ST131-H30Rx with septic shock and mortality, as well as prolonged carriage.
Ecotoxicol Environ Saf
September 2025
Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental & Resource Science, Zhejiang University, Hangzhou 310058, China; Zhejiang Provincial Key Laboratory of Subtropic Soil and Plant Nutrition, Zhejiang University, Hangzhou 310058, China. Ele
Seven plant growth-promoting bacteria (PGPB) were isolated from extracts of surface-sterilized Sedum alfredii Hance. Among the seven isolates, the strain SaRB5 identified as Stenotrophomonas maltophilia through 16S rDNA sequence analysis, exhibited highest levels of heavy metal resistance and plant growth-promoting traits. SaRB5 tolerated high concentrations of cadmium (Cd) (1.