Hal: an automated pipeline for phylogenetic analyses of genomic data.

Barbara Robbertse, Ryan J Yoder, Alex Boyd, John Reeves, Joseph W Spatafora

PLoS Curr

National Center for Biotechnology Information, Bethesda, Maryland; Peace Corps; Oregon State University and Bonzi Software Development.

Published: February 2011

The rapid increase in genomic and genome-scale data is producing unprecedented volumes of discrete sequence data for phylogenetic analyses. Major analytical obstacles remain, however, before these data can be analyzed with existing phylogenetic software. They include the management of large data sets without standardized naming conventions, the identification and filtering of orthologous clusters of proteins or genes, and the assembly of alignments of orthologous sequence data into individual and concatenated superalignments. Here we report an automated pipeline, Hal, that produces multiple alignments and trees from genomic data. The alignments can be produced by any of four alignment programs and analyzed by a variety of phylogenetic programs. In short, the Hal pipeline connects BLASTP, MCL, user-specified alignment programs, Gblocks, ProtTest, and user-specified phylogenetic programs to produce species trees. The script is available at SourceForge (http://sourceforge.net/projects/bio-hal/). Results from an example analysis of Kingdom Fungi are briefly discussed.
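Two of the steps named in the abstract, filtering orthologous clusters and assembling a concatenated superalignment, can be illustrated with a minimal Python sketch. This is not Hal's code. It assumes that clusters arrive one per line with whitespace-separated members, that sequence identifiers have the hypothetical form `species|protein`, and that each per-cluster alignment is a dictionary from species to aligned sequence. The function names `single_copy_clusters` and `concatenate` are invented for illustration.

```python
from collections import Counter


def single_copy_clusters(cluster_lines, min_taxa):
    """Keep clusters where each species occurs exactly once.

    Assumes one cluster per line and IDs shaped like 'species|protein'
    (a hypothetical naming convention, not taken from Hal).
    """
    kept = []
    for line in cluster_lines:
        members = line.split()
        if not members:
            continue
        # Count how many sequences each species contributes to the cluster.
        counts = Counter(m.split("|", 1)[0] for m in members)
        if len(counts) >= min_taxa and all(c == 1 for c in counts.values()):
            kept.append(members)
    return kept


def concatenate(alignments, taxa):
    """Join per-cluster alignments into one superalignment.

    Taxa absent from a cluster are padded with gap characters so that
    every row of the result has the same length.
    """
    parts = {t: [] for t in taxa}
    for aln in alignments:
        if not aln:
            continue
        length = len(next(iter(aln.values())))
        for t in taxa:
            parts[t].append(aln.get(t, "-" * length))
    return {t: "".join(p) for t, p in parts.items()}
```

Requiring one sequence per species is only one possible orthology filter; a real analysis might also tolerate missing taxa or resolve paralogs differently.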

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3038436
DOI: http://dx.doi.org/10.1371/currents.RRN1213

Similar Publications

The tree-based pipeline optimization tool: Tackling biomedical research problems with genetic programming and automated machine learning.

Patterns (N Y)

July 2025

Cedars-Sinai Medical Center, Los Angeles, CA, USA.

Jose Guadalupe Hernandez, Anil Kumar Saini, Attri Ghosh, Jason H Moore

The tree-based pipeline optimization tool (TPOT) is one of the earliest automated machine learning (ML) frameworks developed for optimizing ML pipelines, with an emphasis on addressing the complexities of biomedical research. TPOT uses genetic programming to explore a diverse space of pipeline structures and hyperparameter configurations in search of optimal pipelines. Here, we provide a comparative overview of the conceptual similarities and implementation differences between the previous and latest versions of TPOT, focusing on two key aspects: (1) the representation of ML pipelines and (2) the underlying algorithm driving pipeline optimization.

Pyomo: Accidentally outrunning the bear.

Patterns (N Y)

July 2025

Sandia National Laboratories, Albuquerque, NM, USA.

Miranda Mundt, William E Hart, Emma S Johnson, Bethany Nicholson, John D Siirola

Pyomo is an open-source optimization modeling software that has undergone significant evolution since its inception in 2008. Pyomo has evolved to enhance flexibility, solver integration, and community engagement. Modern collaborative tools for open-source software have facilitated the development of new Pyomo functionality and improved our development process through automated testing and performance-tracking pipelines.

Two-Step Semi-Automated Classification of Choroidal Metastases on MRI: Orbit Localization via Bounding Boxes Followed by Binary Classification via Evolutionary Strategies.

AJNR Am J Neuroradiol

September 2025

From the Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America (J.S.S., B.M., S.H., A.H., J.S.), and Department of Aerospace Engineering, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India (H.S.).

Jeffrey S Shi, Bala McRae-Posani, Sofia Haque, Andrei Holodny, Hrithwik Shalu

Background and Purpose: The choroid of the eye is a rare site for metastatic tumor spread, and as small lesions on the periphery of brain MRI studies, these choroidal metastases are often missed. To improve their detection, we aimed to use artificial intelligence to distinguish between brain MRI scans containing normal orbits and choroidal metastases.

Materials and Methods: We present a novel hierarchical deep learning framework for sequential cropping and classification on brain MRI images to detect choroidal metastases.

Toward universal immunofluorescence normalization for multiplex tissue imaging with UniFORM.

Cell Rep Methods

August 2025

Department of Biomedical Engineering and Computational Biology Program, OHSU, Portland, OR, USA; Knight Cancer Institute, OHSU, Portland, OR, USA.

Kunlun Wang, Kaoutar Ait-Ahmad, Sam Kupp, Zachary Sims, Eric Cramer

We present UniFORM, a non-parametric, Python-based pipeline for normalizing multiplex tissue imaging (MTI) data at both the feature and pixel levels. UniFORM employs an automated rigid landmark registration method tailored to the distributional characteristics of MTI, with UniFORM operating without prior distributional assumptions and handling both unimodal and bimodal patterns. By aligning the biologically invariant negative populations, UniFORM removes technical variation while preserving tissue-specific expression patterns in positive populations.

Robust, Open-Source and Automation-Friendly DNA Extraction Protocol for Hologenomic Research.

Mol Ecol Resour

September 2025

Centre for Evolutionary Hologenomics (CEH), Globe Institute, University of Copenhagen, Copenhagen, Denmark.

Jonas G Lauritsen, Christian Carøe, Nanna Gaun, Garazi Martin-Bideguren, Aoife Leonard

Global efforts to standardise methodologies benefit greatly from open-source procedures that enable the generation of comparable data. Here, we present a modular, high-throughput nucleic acid extraction protocol standardised within the Earth Hologenome Initiative to generate both genomic and microbial metagenomic data from faecal samples of vertebrates. The procedure enables the purification of either RNA and DNA in separate fractions (DREX1) or as total nucleic acids (DREX2).
