Cluster-efficient pangenome graph construction with nf-core/pangenome.

Simon Heumos, Michael L Heuer, Friederike Hanssen, Lukas Heumos, Andrea Guarracino, Peter Heringer, Philipp Ehmele, Pjotr Prins, Erik Garrison, Sven Nahnsen

Bioinformatics

Quantitative Biology Center (QBiC) Tübingen, University of Tübingen, Tübingen, 72076, Germany.

Published: November 2024

Motivation: Pangenome graphs offer a comprehensive way of capturing genomic variability across multiple genomes. However, current construction methods often introduce biases, excluding complex sequences or relying on references. The PanGenome Graph Builder (PGGB) addresses these issues. To date, though, there is no state-of-the-art pipeline allowing for easy deployment, efficient and dynamic use of available resources, and scalable usage at the same time.

Results: To overcome these limitations, we present nf-core/pangenome, a reference-unbiased approach implemented in Nextflow following nf-core's best practices. Leveraging biocontainers ensures portability and seamless deployment in High-Performance Computing (HPC) environments. Unlike PGGB, nf-core/pangenome distributes alignments across cluster nodes, enabling scalability. Demonstrating its efficiency, we constructed pangenome graphs for 1000 human chromosome 19 haplotypes and 2146 Escherichia coli sequences, achieving a two- to threefold speedup compared to PGGB without increasing greenhouse gas emissions.

Availability and Implementation: nf-core/pangenome is released under the MIT open-source license, available on GitHub and Zenodo, with documentation accessible at https://nf-co.re/pangenome/docs/usage.

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568064
DOI: http://dx.doi.org/10.1093/bioinformatics/btae609

Similar Publications

GViNC: an innovative framework for genome graph comparison reveals hidden patterns in the genetic diversity of human populations.

NAR Genom Bioinform

September 2025

Centre for Integrative Biology and Systems Medicine (IBSE), Wadhwani School of Data Science and AI, Indian Institute of Technology (IIT) Madras, Chennai 600036, India.

Venkatesh Kamaraj, Ayam Gupta, Karthik Raman, Manikandan Narayanan, Himanshu Sinha

Genome graphs provide a powerful reference structure for representing genetic diversity. Their structure emphasizes the polymorphic regions in a collection of genomes, enabling network-based comparisons of population-level variation. However, current tools are limited in their ability to quantify and compare structural features across large genome graphs.

Structural variants are enriched in deleterious visible phenotypes in .

bioRxiv

August 2025

Alejandra Samano, Matthew Musat, Mihir Junaghare, Asad Ahmad, Mehlum Ali

Genome structural variants (SVs) comprise a sizable portion of functionally important genetic variation in all organisms; yet, many SVs evade discovery using short reads. While long-read sequencing can find the hidden SVs, the role of SVs in variation in organismal traits remains largely unclear. To address this gap, we investigate the molecular basis of 50 classical phenotypes in 11 strains using highly contiguous genome assemblies generated with Oxford Nanopore long reads.

A comprehensive water buffalo pangenome reveals extensive structural variation linked to population-specific signatures of selection.

Gigascience

January 2025

The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK.

Fazeela Arshad, Siddharth Jayaraman, Andrea Talenti, Rachel Owen, Muhammad Mohsin

Background: Water buffalo is a cornerstone livestock species in many low- and middle-income countries, yet major gaps persist in its genomic characterization, complicated by the divergent karyotypes of its two subspecies (swamp and river). Such genomic complexity makes water buffalo a particularly good candidate for the use of graph genomics, which can capture variation missed by linear reference approaches. However, the utility of this approach to improve water buffalo has been largely unexplored.

A Near Telomere-To-Telomere Genome Assembly and Graph-Based Pangenome of Tartary Buckwheat (Fagopyrum tataricum).

Plant Biotechnol J

August 2025

College of Biological Sciences and Technology, Taiyuan Normal University, Taiyuan, China.

Wendy Li, Hanfei Liang, Jilin Sun, Xiao Zhang, Qiang He

Pangenome-based genome inference using integer programming.

Genome Res

August 2025

Indian Institute of Science.

Ghanshyam Chandra, Md Helal Hossen, Stephan Scholz, Alexander T Dilthey, Daniel Gibney

Affordable genotyping methods are essential in genomics. Commonly used genotyping methods primarily support single nucleotide variants and short indels but neglect structural variants. Additionally, accuracy of read alignments to a reference genome is unreliable in highly polymorphic and repetitive regions, further impacting genotyping performance.
