Rapid advances in sequencing technologies make it possible to produce sequencing data cost-effectively and at high volume. Processing this data for real-time clinical diagnosis can be time-consuming on a single computing node. This work presents a complete variant calling workflow, implemented using the Message Passing Interface (MPI) to leverage the benefits of high-bandwidth interconnects. The solution (GenMPI) is portable and flexible: it can be deployed to any private or public cluster or cloud infrastructure, and any alignment or variant calling application can be used with minimal adaptation. To achieve high performance, compressed input data can be streamed in parallel to alignment applications, while uncompressed data can use internal file seek functionality to eliminate the bottleneck of streaming input data from a single node. Alignment output can be stored directly in multiple chromosome-specific SAM files or in a single SAM file. After alignment, a distributed queue using MPI RMA (Remote Memory Access) atomic operations is created for the sorting, indexing, duplicate marking (if necessary) and variant calling applications. We verify the accuracy of the variants against the original single-node methods. We also show that for 300x coverage data, alignment scales almost linearly up to 64 nodes (8192 CPU cores). Overall, this work outperforms existing big-data-based workflows by a factor of two and is almost 20% faster than other MPI-based implementations for alignment, without any extra memory overhead. Sorting, indexing, duplicate removal and variant calling also scale up to an 8-node cluster. For paired-end short-read (Illumina) data, we integrated the BWA-MEM aligner and three variant callers (GATK HaplotypeCaller, DeepVariant and Octopus), while for long-read data, we integrated the Minimap2 aligner and three different variant callers (DeepVariant, DeepVariant with WhatsHap for phasing (PacBio) and Clair3 (ONT)).
All code and scripts are available at: https://github.com/abs-tudelft/gen-mpi.
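The seek-based input partitioning mentioned in the abstract can be illustrated with a minimal sketch. This is not GenMPI's code: the function names `next_record_start`, `chunk_bounds` and `read_chunk` are hypothetical, and the sketch assumes a single uncompressed FASTQ file with strict four-line records (no wrapped sequence lines). Each worker seeks to its share of the file and moves forward to the next record boundary, so no single node has to stream the input to the others.

```python
import os


def next_record_start(fh, offset, file_size):
    """Return the offset of the first FASTQ record starting at or after `offset`."""
    if offset <= 0:
        return 0
    if offset >= file_size:
        return file_size
    # Step back one byte so a record that begins exactly at `offset` is kept,
    # then discard the remainder of the line we landed in.
    fh.seek(offset - 1)
    fh.readline()
    while True:
        pos = fh.tell()
        header = fh.readline()
        if not header:
            return file_size
        after_header = fh.tell()
        # A quality line may also start with '@', so a true header is
        # confirmed by a '+' separator two lines further down.
        if header.startswith(b"@"):
            fh.readline()  # sequence line
            if fh.readline().startswith(b"+"):
                return pos
        fh.seek(after_header)


def chunk_bounds(path, rank, size):
    """Byte range [start, end) of whole records owned by worker `rank` of `size`."""
    file_size = os.path.getsize(path)
    with open(path, "rb") as fh:
        start = next_record_start(fh, file_size * rank // size, file_size)
        end = next_record_start(fh, file_size * (rank + 1) // size, file_size)
    return start, end


def read_chunk(path, rank, size):
    """Read only this worker's records by seeking directly to its byte range."""
    start, end = chunk_bounds(path, rank, size)
    with open(path, "rb") as fh:
        fh.seek(start)
        return fh.read(end - start)
```

In an MPI program, `rank` and `size` would typically come from the communicator (for example `MPI.COMM_WORLD.Get_rank()` and `Get_size()` in mpi4py). Because neighbouring workers compute their shared boundary from the same offset, the chunks neither overlap nor leave gaps. Paired-end input in two files would need extra care to keep mates together, which this sketch does not attempt.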
DOI: http://dx.doi.org/10.1109/TCBBIO.2025.3595409
Cancer Immunol Immunother
September 2025
Department of Medical Oncology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.
Whole blood (WB) transcriptomics offers a minimally invasive method to assess patients' immune systems. This study aimed to identify transcriptional patterns in WB associated with clinical outcomes in patients treated with immune checkpoint inhibitors (ICIs). We performed RNA-sequencing on pre-treatment WB samples from 145 patients with advanced cancer.
mBio
September 2025
Department of Biology, Laboratory of Molecular Cell Biology, KU Leuven, Leuven, Flanders, Belgium.
Echinocandins, which target the fungal β-1,3-glucan synthase (Fks), are essential for treating invasive fungal infections, yet resistance is increasingly reported. While resistance typically arises through mutations in Fks hotspots, emerging evidence suggests a contributing role of changes in membrane sterol composition due to mutations. Here, we present a clinical case of () in which combined mutations in and , but not alone, appear to confer echinocandin resistance.
Brief Bioinform
August 2025
Department of Respiratory Medicine, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157, Xiwu Road, Xincheng District, Xi'an 710004, China.
Accurate tumor mutation burden (TMB) quantification is critical for immunotherapy stratification, yet remains challenging due to variability across sequencing platforms, tumor heterogeneity, and variant calling pipelines. Here, we introduce TMBquant, an explainable AI-powered caller designed to optimize TMB estimation through dynamic feature selection, ensemble learning, and automated strategy adaptation. Built upon the H2O AutoML framework, TMBquant integrates variant features, minimizes classification errors, and enhances both accuracy and stability across diverse datasets.
Microbiol Spectr
September 2025
Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Rio de Janeiro, Brazil.
is a commensal bacterium that colonizes the gut of humans and animals and is a major opportunistic pathogen, known for causing multidrug-resistant healthcare-associated infections (HAIs). Its ability to thrive in diverse environments and disseminate antimicrobial resistance genes (ARGs) across ecological niches highlights the importance of understanding its ecological, evolutionary, and epidemiological dynamics. The CRISPR2 locus has been used as a valuable marker for assessing clonality and phylogenetic relationships in .
NAR Genom Bioinform
September 2025
Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark.
Advances in Oxford Nanopore Technologies (ONT) sequencing, with the introduction of the R10.4.1 flow cell, have reduced sequencing error rates to <1%.