Background: While alignment has traditionally been the primary approach for establishing homology prior to phylogenetic inference, alignment-free methods offer a simplified alternative, particularly beneficial when handling genome-wide data involving long sequences and complex events such as rearrangements. Moreover, alignment-free methods become crucial for data types like genome skims, where assembly is impractical. However, despite these benefits, alignment-free techniques have not gained widespread acceptance since they lack the accuracy of alignment-based techniques, primarily due to their reliance on simplified models of pairwise distance calculation.
Results: Here, we present a likelihood-based alignment-free technique for phylogenetic tree construction. We encode the presence or absence of k-mers in genome sequences in a binary matrix and estimate phylogenetic trees using a maximum likelihood approach. This likelihood-based alignment-free method for phylogeny estimation is implemented for the first time in a software tool named PEAFOWL, which is available at https://github.com/hasin-abrar/Peafowl-repo. We analyze the performance of our method on seven real datasets and compare the results with state-of-the-art alignment-free methods.
Conclusions: The results suggest that our method is competitive with existing alignment-free tools. This indicates that maximum-likelihood-based alignment-free methods may, in the future, be refined to outperform alignment-free methods that rely on distance calculation, as has been the case in the alignment-based setting.
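The k-mer encoding described in the Results can be illustrated with a minimal sketch. It assumes plain uppercase DNA strings and a fixed k, and it is not PEAFOWL's actual implementation: the function names `kmer_set` and `presence_absence_matrix` are hypothetical, and details such as canonical k-mers, the choice of k, or k-mer filtering are not stated in the abstract and are omitted here.

```python
from typing import Dict, List, Set, Tuple


def kmer_set(sequence: str, k: int) -> Set[str]:
    """Return the set of distinct k-mers in one sequence (hypothetical helper)."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def presence_absence_matrix(
    genomes: Dict[str, str], k: int
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Encode k-mer presence (1) or absence (0) per genome.

    Rows follow the sorted genome names; columns follow the sorted
    union of all k-mers observed in any genome.
    """
    names = sorted(genomes)
    per_genome = {name: kmer_set(genomes[name], k) for name in names}
    columns = sorted(set().union(*per_genome.values()))
    matrix = [
        [1 if kmer in per_genome[name] else 0 for kmer in columns]
        for name in names
    ]
    return names, columns, matrix


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    toy = {"taxonA": "ACGT", "taxonB": "ACGA"}
    names, columns, matrix = presence_absence_matrix(toy, k=3)
    print(columns)
    for name, row in zip(names, matrix):
        print(name, "".join(map(str, row)))
```

Each row is then a binary character sequence for one taxon, which is the kind of input a maximum likelihood tree search over binary characters would take.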
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11887328 | PMC |
http://dx.doi.org/10.1186/s12859-025-06080-w | DOI Listing |
Fungal Biol
October 2025
Engineering Bioprocess and Biotechnology Post-Graduation Program, Department of Bioprocess Engineering and Biotechnology, Federal University of Parana, Curitiba, Paraná, Brazil.
Lichens exemplify a unique symbiotic relationship between fungi and algae or cyanobacteria, where fungi (mycobionts) provide structural support, while algae or cyanobacteria (photobionts) provide nutrients. Recent discoveries in the order Chaetothyriales have led to the description of several lichenicolous species, underscoring an intricate relationship of some black yeast-like fungi with lichens. The present study aims to investigate public metagenomic data of lichens available in the SRA database, covering a total of 2888 samples.
Brief Bioinform
September 2025
Beijing Institute of Mathematical Sciences and Applications (BIMSA), Beijing 101408, P. R. China.
With the rapid development of genomic sequencing technologies, there is an increasing demand for efficient and accurate sequence analysis methods. However, existing methods face challenges in handling long, variable-length sequences and large-scale datasets. To address these issues, we propose a novel encoding method, the Energy Entropy Vector (EEV).
BMC Bioinformatics
September 2025
Genome Informatics, Faculty of Technology and Center for Biotechnology, Bielefeld University, 33615, Bielefeld, Germany.
Background: The increasing amount of available genome sequence data enables large-scale comparative studies. A common task is the inference of phylogenies, which is challenging if close reference sequences are not available, genome sequences are incompletely assembled, or the high number of genomes precludes multiple sequence alignment in reasonable time. SANS is an alignment-free, whole-genome based approach for phylogeny estimation.
BMC Bioinformatics
September 2025
Computational Chemical Biology Laboratory, Department of BioMolecular Sciences, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, 14040-900, Brazil.
Antimicrobial resistance (AMR) is one of the most concerning modern threats, as it places a greater burden on health systems than HIV and malaria combined. Current surveillance strategies for tracking AMR rely on genomic comparisons and depend on sequence alignment with strict similarity cutoffs of greater than 95%. Therefore, these methods have high false-negative error rates due to a lack of reference sequences with a representative coverage of AMR protein diversity.
J Mol Biol
August 2025
Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China.
The advent of highly accurate protein structure prediction methods has fueled an exponential expansion of the protein structure database. Consequently, there is a rising demand for rapid and precise protein structure search. Traditional alignment-based methods are designed for precise pairwise comparisons, offering high accuracy.