Kssdtree: an interactive Python package for phylogenetic analysis based on sketching technique.

Bioinformatics

Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518055, China.

Published: October 2024



Article Abstract

Summary: Sketching technologies have recently emerged as a promising solution for real-time, large-scale phylogenetic analysis. However, existing sketching-based phylogenetic tools exhibit drawbacks, including platform restrictions, deficiencies in tree visualization, and inherent distance estimation bias. These limitations collectively impede the overall convenience and efficiency of the analysis. In this study, we introduce Kssdtree, an interactive Python package designed to address these challenges. Kssdtree surpasses other sketching-based tools by demonstrating superior performance in terms of both accuracy and time efficiency on comprehensive benchmarking datasets. Notably, Kssdtree offers key advantages such as intra-species phylogenomic analysis and GTDB-based phylogenetic placement analysis, significantly enhancing the scope and depth of phylogenetic investigations. Through extensive evaluations and comparisons, Kssdtree stands out as an efficient and versatile method for real-time, large-scale phylogenetic analysis.

Availability and Implementation: The Kssdtree Python package is freely accessible at https://pypi.org/project/kssdtree, and the source code is available at https://github.com/yhlink/kssdtree. The documentation and instantiation for the software are available at https://kssdtree.readthedocs.io/en/latest. A video tutorial is available at https://youtu.be/_6hg59Yn-Ws.
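
As a rough illustration of what "sketching" means in the summary above, the following is a minimal, generic sketch of k-mer sketching and sketch-based distance estimation. It is not Kssdtree's algorithm and does not use the Kssdtree API: it assumes a bottom-s MinHash sketch and a Mash-style distance formula, and every function name, the k-mer length, and the sketch size are illustrative assumptions.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a stable 64-bit integer (hash choice is an assumption)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=8, size=64):
    """Bottom-s sketch: the `size` smallest distinct k-mer hash values."""
    return sorted({hash_kmer(km) for km in kmers(seq, k)})[:size]


def jaccard_estimate(a, b, size=64):
    """Estimate the Jaccard index of two k-mer sets from their sketches."""
    merged = sorted(set(a) | set(b))[:size]
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(j, k):
    """Mash-style distance from a Jaccard estimate j and k-mer length k."""
    if j <= 0.0:
        return 1.0
    return min(1.0, max(0.0, -math.log(2.0 * j / (1.0 + j)) / k))


if __name__ == "__main__":
    # Hypothetical toy sequences; real genomes are far longer.
    x = "ACGTACGTTGCAAGCTTAGCGATCGATCGGATCCGATTACA"
    y = "ACGTACGTTGCAAGCTTAGCGATCGTTCGGATCCGATTACA"
    j = jaccard_estimate(sketch(x), sketch(y))
    print("Jaccard estimate:", j, "distance:", mash_distance(j, 8))
```

A pairwise distance matrix computed this way could then be passed to a distance-based tree builder such as neighbor joining. The package documentation linked above describes Kssdtree's actual sketching and distance method.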


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11467128
DOI: http://dx.doi.org/10.1093/bioinformatics/btae566

