Harpy: a pipeline for processing haplotagging linked-read data.

Bioinform Adv

Department of Natural Resources and the Environment, Cornell University, Ithaca, NY 14853, United States.

Published: June 2025


Citations: 20

Article Abstract

Motivation: Haplotagging is a method for linked-read sequencing, which leverages the cost-effectiveness and throughput of short-read sequencing while retaining part of the long-range haplotype information captured by long-read sequencing. Despite its utility and advantages over similar methods, existing linked-read analytical pipelines are incompatible with haplotagging data.

Results: We describe Harpy, a modular and user-friendly software pipeline for processing all stages of haplotagged linked-read data, from raw sequence data to phased genotypes and structural variant detection.

Availability and Implementation: https://github.com/pdimens/harpy.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12198493
DOI: http://dx.doi.org/10.1093/bioadv/vbaf133

Similar Publications

Pyomo: Accidentally outrunning the bear.

Patterns (N Y)

July 2025

Sandia National Laboratories, Albuquerque, NM, USA.

Pyomo is an open-source optimization modeling software that has undergone significant evolution since its inception in 2008. Pyomo has evolved to enhance flexibility, solver integration, and community engagement. Modern collaborative tools for open-source software have facilitated the development of new Pyomo functionality and improved our development process through automated testing and performance-tracking pipelines.

Traditionally, clinical devices are designed, tested and improved through lengthy and expensive laboratory experiments and clinical trials [1]. More recently, computational methods have allowed for rapid testing, speeding up the design process and enabling far more complete searches of design space. While computational models cannot fully capture the complexities of biological systems, they provide valuable insights into crucial underlying mechanisms, such as the effects of fluid-structure interactions (FSIs).

Summary: Dynamic models represent a powerful tool for studying complex biological processes, ranging from cell signalling to cell differentiation. Building such models often requires computationally demanding modelling workflows, such as model exploration and parameter estimation. We developed two Julia-based tools: SBMLImporter.

Summary: In the era of large data, the cloud is increasingly used as a computing environment, necessitating the development of cloud-compatible pipelines that can provide uniform analysis across disparate biological datasets. The Warp Analysis Research Pipelines (WARP) repository is a GitHub repository of open-source, cloud-optimized workflows for biological data processing that are semantically versioned, tested, and documented. A companion repository, WARP-Tools, hosts Docker containers and custom tools used in WARP workflows.

In recent years, amino acids have garnered extensive attention as environmentally friendly, small-dose additives for modulating hydrate formation and aggregation behavior. Amino acids, due to their amphiphilic nature, can adsorb at the gas-liquid interface and on hydrate crystal surfaces, thereby modifying interfacial properties and influencing crystal growth patterns. In our measurements, the amino acids displayed a concentration-dependent "double effect".
