Computing evolutionary distinctiveness indices in large scale analysis.

Algorithms Mol Biol

IRMACS and BioSciences, Simon Fraser University, 8888 University Drive, Burnaby, V5A 1S6 Canada.

Published: April 2012


Article Abstract

We present optimal linear-time algorithms for computing the Shapley values and 'heightened evolutionary distinctiveness' (HED) scores for the set of taxa in a phylogenetic tree. We demonstrate the efficiency of these new algorithms by applying them to a set of 10,000 reasonable 5139-species mammal trees. This is the first time these indices have been computed on such a large taxon, and we contrast our findings with an ad hoc index for mammals, fair proportion (FP), used by the Zoological Society of London's EDGE programme. Our empirical results follow expectations. In particular, the Shapley values are very strongly correlated with the FP scores, but give a higher weight to the few monotremes that comprise the sister group to all other mammals. We also find that the HED score, which measures a species' unique contribution to future subsets as a function of the probability that close relatives will go extinct, is very sensitive to the estimated probabilities. When they are low, HED scores are less than FP scores and approach the simple measure of a species' age. Deviations (like the Solenodon genus of the West Indies) occur when sister species are both at high risk of extinction and their clade roots deep in the tree. Conversely, when endangered species have higher probabilities of being lost, HED scores can be greater than FP scores, and species like the African elephant Loxodonta africana, the two solenodons and the thumbless bat Furipterus horrens can move up the rankings. We suggest that conservation attention be applied to such species that carry genetic responsibility for imperiled close relatives. We also briefly discuss extensions of Shapley values and HED scores that are possible with the algorithms presented here.
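For readers unfamiliar with the fair proportion (FP) index that the abstract uses as a baseline, the following is a minimal Python sketch. It is not the paper's Shapley or HED algorithm. It only illustrates the standard FP definition: each branch length is split equally among the leaves descending from it, and a species' score is the sum of its shares along the path to the root. The function name `fair_proportion` and the `parent`/`length` dictionary encoding of the tree are assumptions made for this illustration. Two passes over the nodes suffice, so the sketch runs in time linear in the number of nodes.

```python
from collections import defaultdict


def fair_proportion(parent, length):
    """Fair proportion score of every leaf in a rooted tree.

    parent: dict mapping each node to its parent (the root maps to None).
    length: dict mapping each non-root node to the length of the edge above it.
    Both encodings are illustrative assumptions, not taken from the paper.
    """
    children = defaultdict(list)
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            children[par].append(node)

    # List nodes so that every parent appears before its children.
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children[node])

    # Pass 1 (bottom-up): number of leaves below each node.
    n_leaves = {}
    for node in reversed(order):
        kids = children[node]
        n_leaves[node] = 1 if not kids else sum(n_leaves[k] for k in kids)

    # Pass 2 (top-down): add each edge's equal share to the running total.
    acc = {root: 0.0}
    for node in order:
        if node != root:
            acc[node] = acc[parent[node]] + length[node] / n_leaves[node]

    return {node: acc[node] for node in order if not children[node]}


# Toy tree ((A:1,B:1):2,C:3) with hypothetical node names.
parent = {"root": None, "AB": "root", "A": "AB", "B": "AB", "C": "root"}
length = {"AB": 2.0, "A": 1.0, "B": 1.0, "C": 3.0}
print(fair_proportion(parent, length))
```

In the toy tree, A and B each receive their own branch (1) plus half of the shared branch (2 / 2), giving 2, while C keeps its whole branch of length 3. The scores sum to the total branch length of the tree (7), which is the property that makes FP a partition of phylogenetic diversity among species.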

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3353162
DOI: http://dx.doi.org/10.1186/1748-7188-7-6
