Article Abstract

Despite its importance in understanding biology and computer-aided drug discovery, the accurate prediction of protein ionization states remains a formidable challenge. Physics-based approaches struggle to capture the small, competing contributions in the complex protein environment, while machine learning (ML) is hampered by the scarcity of experimental data. Here we report the development of pKa ML (KaML) models based on decision trees and graph attention networks (GAT), exploiting physicochemical understanding and a new experimental pKa database (PKAD-3) enriched with highly shifted pKa values. KaML-CBtree significantly outperforms the current state of the art in predicting pKa values and ionization states across all six titratable amino acids, notably achieving accurate predictions for deprotonated cysteines and lysines, a blind spot in previous models. The superior performance of KaMLs is achieved in part through several innovations, including separate treatment of acid and base, data augmentation using AlphaFold structures, and model pretraining on a theoretical pKa database. We also introduce the classification of protonation states as a metric for evaluating pKa prediction models. A meta-feature analysis suggests a possible reason why the lightweight tree model outperforms the more complex deep learning GAT. We release an end-to-end pKa predictor based on KaML-CBtree and the new PKAD-3 database, which facilitates a variety of applications and provides the foundation for further advances in protein electrostatics research.
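
The protonation-state classification metric mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the KaML code: it assumes a reference pH of 7, assigns a site as deprotonated when its pKa lies below that pH (the Henderson-Hasselbalch midpoint), and uses hypothetical function names and made-up pKa values purely for illustration.

```python
def deprotonated_fraction(pka: float, ph: float = 7.0) -> float:
    """Henderson-Hasselbalch: fraction of sites in the deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def protonation_state(pka: float, ph: float = 7.0) -> str:
    """Assumed rule: a site is mostly deprotonated when pKa < pH."""
    return "deprotonated" if pka < ph else "protonated"


def classification_accuracy(predicted, experimental, ph: float = 7.0) -> float:
    """Share of sites whose predicted state matches the experimental state."""
    pairs = list(zip(predicted, experimental))
    matches = sum(
        protonation_state(p, ph) == protonation_state(e, ph) for p, e in pairs
    )
    return matches / len(pairs)


# Illustrative (invented) pKa values for three titratable sites.
predicted_pkas = [3.9, 10.2, 6.5]
experimental_pkas = [4.1, 10.5, 7.4]
accuracy = classification_accuracy(predicted_pkas, experimental_pkas)
```

In this toy example the third site has a small pKa error (0.9 units) but is assigned the wrong state at pH 7, which is the kind of failure a state-based metric exposes and a pKa error metric alone can hide.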

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601431
DOI: http://dx.doi.org/10.1101/2024.11.09.622800

Similar Publications

Slimmer Geminals For Accurate F12 Electronic Structure Models.

J Chem Theory Comput

September 2025

Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24060, United States.

The Slater-type F12 geminal length scales originally tuned for the second-order Møller-Plesset F12 method are too large for higher-order F12 methods formulated using the SP (diagonal fixed-coefficient spin-adapted) F12 ansatz. The new geminal parameters reported herein reduce the basis set incompleteness errors (BSIEs) of absolute coupled-cluster singles and doubles F12 correlation energies by a significant margin that increases with the cardinal number of the basis. The effect of geminal reoptimization is especially pronounced for the cc-pVZ-F12 basis sets (specifically designed for use with F12 methods) relative to their conventional aug-cc-pVZ counterparts.

We derive the coupled-cluster doubles (CCD) amplitude equations by introduction of the particle-hole-time decoupled electronic self-energy. The resulting analysis leads to an expression for the ground-state correlation energy that is exactly of the form obtained in coupled-cluster doubles theory. We demonstrate the relationship to the ionization potential/electron affinity equation-of-motion coupled-cluster doubles (IP/EA-EOM-CCD) eigenvalue problem by coupling the reverse-time self-energy contributions while maintaining particle-hole separability.

Unusual Core-Ionization Pathways in Hydrated Na: A Theoretical KV Study.

Inorg Chem

September 2025

Laboratoire de Chimie Physique Matière et Rayonnement (LCPMR), CNRS UMR 7614, Sorbonne Université (SU), 4 place Jussieu, Paris 75005, France.

The one-photon KV X-ray photoelectron spectra of Na and its hydrated clusters [Na(H2O)n] (n = 1-6) are dominated by the unusual 1s → 1s3s transition. KV spectroscopy also reveals a pronounced redistribution of the 1s → 1s3p transition cross sections, directly correlated with hydration number and molecular arrangement. Its intrinsic two-step nature, involving simultaneous core ionization and core excitation, enables detailed investigation of solvation-induced electronic structure changes, including dipole-forbidden excitations, core-valence charge transfer, and subtle 1s → V energy shifts.

Radiation-induced single event effects in vertically prolonged drain dual gate SiGe source TFET.

J Mol Model

September 2025

Department of Electronics and Communication Engineering, National Institute of Technology Patna, Patna, Bihar, India.

Context: This study investigates the radiation tolerance of a SiGe source vertical tunnel field effect transistor (VTFET) under heavy ion-induced single event effects (SEEs). SEEs occur when high-energy particles interact with semiconductor devices, leading to unintended behavior. The effect of high-energy ions on the VTFET is examined for various linear energy transfer (LET) values and at multiple ion hit locations.

We model Auger spectra using second-order Møller-Plesset perturbation (MP2) theory combined with complex-scaled basis functions. For this purpose, we decompose the complex MP2 energy of the core-hole state into contributions from specific decay channels and propose a corresponding equation-of-motion (EOM) method for computing the doubly ionized final states of Auger decay. These methods lead to significant savings in computational cost compared to our recently developed approaches based on coupled-cluster theory [F.
