Article Abstract

Advances in long-read sequencing (LRS) technologies continue to make whole-genome sequencing more complete, affordable, and accurate. LRS provides significant advantages over short-read sequencing approaches, including phased de novo genome assembly, access to previously excluded genomic regions, and discovery of more complex structural variants (SVs) associated with disease. Limitations remain with respect to cost, scalability, and platform-dependent read accuracy, and the tradeoffs between sequence coverage and sensitivity of variant discovery are important experimental considerations for the application of LRS. We compare the genetic variant-calling precision and recall of the Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) HiFi platforms over a range of sequence coverages. For read-based applications, LRS sensitivity begins to plateau around 12-fold coverage, with a majority of variants called with reasonable accuracy (F score above 0.5), and both platforms perform well for SV detection. Genome assembly increases variant-calling precision and recall of SVs and indels in HiFi data sets, with HiFi outperforming ONT in quality as measured by the F score of assembly-based variant call sets. While both technologies continue to evolve, our work offers guidance for designing cost-effective experimental strategies that do not compromise on discovering novel biology.
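For readers unfamiliar with the metrics named in the abstract, the following minimal sketch shows how precision, recall, and an F score relate to one another. It assumes the F score is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name and the counts in the usage lines are hypothetical and are not taken from the study.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from variant-call counts.

    tp: calls that match the truth set (true positives)
    fp: calls absent from the truth set (false positives)
    fn: truth-set variants that were not called (false negatives)
    Assumption: "F score" means the balanced F1 measure.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=120)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.3f}")
```

With these illustrative counts, precision is high but recall is low, so the F1 value sits just above the 0.5 threshold the abstract uses for "reasonable accuracy".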

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10760522
DOI: http://dx.doi.org/10.1101/gr.278070.123

Similar Publications

Accurate tumor mutation burden (TMB) quantification is critical for immunotherapy stratification, yet remains challenging due to variability across sequencing platforms, tumor heterogeneity, and variant calling pipelines. Here, we introduce TMBquant, an explainable AI-powered caller designed to optimize TMB estimation through dynamic feature selection, ensemble learning, and automated strategy adaptation. Built upon the H2O AutoML framework, TMBquant integrates variant features, minimizes classification errors, and enhances both accuracy and stability across diverse datasets.

Performance comparison of germline variant calling tools in sporadic disease cohorts.

Mol Genet Genomics

September 2025

Human Phenome Institute, MOE Key Laboratory of Contemporary Anthropology, Zhangjiang Fudan International Innovation Center, Fudan University, 825 Zhangheng Road, Shanghai, 201203, China.

Accurate variant calling is essential for next-generation sequencing (NGS)-based diagnosis of rare diseases, yet most benchmarking studies have focused on standard cell lines or trio-based samples, with limited relevance to sporadic cases. Here, we systematically compared the performance of DeepVariant and GATK HaplotypeCaller in two Chinese cohorts of patients with sporadic epilepsy (EP) and autism spectrum disorder (ASD). DeepVariant exhibited higher precision and sensitivity in detecting single nucleotide variants (SNVs), while GATK showed a distinct advantage in identifying rare variants, which are often key to understanding the genetic basis of rare diseases.

Modular and cloud-based bioinformatics pipelines for high-confidence biomarker detection in cancer immunotherapy clinical trials.

PLoS One

August 2025

The Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, Maryland, United States of America.

Background: The Cancer Immune Monitoring and Analysis Centers - Cancer Immunologic Data Center (CIMAC-CIDC) network aims to improve cancer immunotherapy by providing harmonized molecular assays and standardized bioinformatics analysis.

Results: In response to evolving bioinformatics standards and the migration of the CIDC to the National Cancer Institute (NCI), we undertook the enhancement of the CIDC's extant whole exome sequencing (WES) and RNA sequencing (RNA-Seq) pipelines. Leveraging open-source tools and cloud-based technologies, we implemented modular workflows using Snakemake and Docker for efficient deployment on the Google Cloud Platform (GCP).

RNA sequencing (RNA-seq) is a widely used method in transcriptomics research, offering insights into gene expression, variant discovery, and, when deconvoluted, the cellular composition of complex tissues. However, existing RNA-seq pipelines frequently emphasize gene expression analysis and often lack cell deconvolution and variant calling. To address these limitations, we present RnaXtract, a comprehensive and user-friendly pipeline designed to maximize extraction of valuable information from bulk RNA-seq data.

Background: Calling structural variants (SVs), i.e., genomic alterations of ≥50bp, from whole genome short-read data remains challenging, as existing callers are known to lack accuracy and robustness.
