Is Over-parameterization a Problem for Profile Mixture Models?

Syst Biol

Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada.

Published: May 2024


Article Abstract

Biochemical constraints on the admissible amino acids at specific sites in proteins lead to heterogeneity of the amino acid substitution process over sites in alignments. It is well known that phylogenetic models of protein sequence evolution that do not account for site heterogeneity are prone to long-branch attraction (LBA) artifacts. Profile mixture models were developed to model heterogeneity of preferred amino acids at sites via a finite distribution of site classes, each with a distinct set of equilibrium amino acid frequencies. However, it is unknown whether the large number of parameters in such models, associated with the many amino acid frequency vectors, can adversely affect tree topology estimates because of over-parameterization. Here, we demonstrate theoretically that for long sequences, over-parameterization does not create problems for estimation with profile mixture models. Under mild conditions, the tree, amino acid frequencies, and other model parameters converge to their true values as sequence length increases, even when there are large numbers of components in the frequency profile distributions. Because large sample theory does not necessarily imply good behavior for shorter alignments, we explore the performance of these models with short alignments simulated with tree topologies that are prone to LBA artifacts. We find that over-parameterization is not a problem for complex profile mixture models even when there are many amino acid frequency vectors. In fact, simple models with few site classes behave poorly. Interestingly, we also found that misspecification of the amino acid frequency vectors does not lead to increased LBA artifacts as long as the estimated cumulative distribution function of the amino acid frequencies at sites adequately approximates the true one. In contrast, misspecification of the amino acid exchangeability rates can severely impair parameter estimation.
Finally, we explore the effects of including in the profile mixture model an additional "F-class" representing the overall frequencies of amino acids in the data set. Surprisingly, the F-class does not help parameter estimation significantly and can decrease the probability of correct tree estimation, depending on the scenario, even though it tends to improve likelihood scores.
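
As a rough illustration of what a profile mixture model computes, the sketch below evaluates the likelihood of a two-sequence alignment as a weighted sum over site classes, each with its own equilibrium frequency vector. It is a minimal sketch under stated assumptions, not the authors' implementation: exchangeabilities are taken to be equal (an F81/Poisson-style process), a toy 4-state alphabet stands in for the 20 amino acids, and the function names, the example profiles, and the fixed F-class weight are all hypothetical.

```python
import math

def f81_transition(pi, t):
    """Transition matrix for an equal-exchangeability (F81-style) process with
    equilibrium frequencies pi, scaled to one expected substitution per unit t."""
    mu = 1.0 / (1.0 - sum(p * p for p in pi))
    decay = math.exp(-mu * t)
    n = len(pi)
    return [[decay * (i == j) + (1.0 - decay) * pi[j] for j in range(n)]
            for i in range(n)]

def pair_site_likelihood(i, j, t, weights, profiles):
    """Mixture likelihood of states i and j at one site of two sequences
    separated by branch length t: sum over classes of w_k * pi_k[i] * P_k[i][j]."""
    total = 0.0
    for w, pi in zip(weights, profiles):
        P = f81_transition(pi, t)
        total += w * pi[i] * P[i][j]
    return total

def pair_log_likelihood(seq_a, seq_b, t, weights, profiles):
    """Log-likelihood of the alignment, treating sites as independent."""
    return sum(math.log(pair_site_likelihood(i, j, t, weights, profiles))
               for i, j in zip(seq_a, seq_b))

def add_f_class(weights, profiles, overall_freqs, f_weight):
    """Append an extra class holding the data set's overall frequencies.
    Assumption: f_weight is fixed here; in practice class weights are estimated."""
    scaled = [w * (1.0 - f_weight) for w in weights]
    return scaled + [f_weight], profiles + [overall_freqs]

# Hypothetical two-class mixture over a toy 4-state alphabet.
weights = [0.6, 0.4]
profiles = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]
seq_a, seq_b = [0, 0, 3, 3], [0, 1, 3, 3]
print(pair_log_likelihood(seq_a, seq_b, 0.5, weights, profiles))

# Same data with an added F-class (uniform overall frequencies assumed).
w_f, p_f = add_f_class(weights, profiles, [0.25] * 4, 0.1)
print(pair_log_likelihood(seq_a, seq_b, 0.5, w_f, p_f))
```

On a full tree the per-class term would come from a pruning-algorithm likelihood rather than a single branch, but the mixture structure (a weighted sum over classes at each site) stays the same.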

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11129589
DOI: http://dx.doi.org/10.1093/sysbio/syad063
