Evaluation of chirality descriptors derived from SMILES heteroencoders.

J Cheminform

LAQV and REQUIMTE, Chemistry Department, NOVA School of Science and Technology, Universidade Nova de Lisboa, 2829-516, Caparica, Portugal.

Published: August 2025


Article Abstract

Molecular representations of chirality derived from latent space vectors (LSVs) of SMILES heteroencoders were explored for training machine learning models to predict chiral properties, and were compared with conventional circular fingerprints. Latent space arithmetic was applied to enhance the representation of chirality by calculating the difference between the original descriptor of a molecule and the descriptor of its enantiomer, or between the original descriptor and the descriptor obtained from the stereochemistry-depleted SMILES string. Random Forest models were trained on a dataset of 3858 molecules extracted from the literature (1929 pairs of enantiomers) to predict the elution order observed on the Chiralpak® AD-H column, as well as intrinsic structural chirality labels (R/S or canonical SMILES @/@@). The descriptors derived from the heteroencoders achieved an accuracy of up to 0.75 in predicting the elution order, whereas the fingerprints performed better (0.82). The difference LSV descriptors showed better predictive ability than the original descriptors.
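
The latent space arithmetic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_encode` is a hypothetical placeholder that returns a deterministic pseudo-vector in place of a trained heteroencoder, and `invert_stereo`, `strip_stereo`, `enantiomer_difference` and `depleted_difference` are names introduced here for illustration. The string manipulation assumes chirality is written only with the tetrahedral `@`/`@@` marks and ignores double-bond stereo marks.

```python
import hashlib
import re
from typing import Callable, List


def invert_stereo(smiles: str) -> str:
    """Swap every @ with @@ (and vice versa) to write the mirror-image SMILES.

    Assumption: only tetrahedral @/@@ marks are present.
    """
    return re.sub(r"@@|@", lambda m: "@" if m.group() == "@@" else "@@", smiles)


def strip_stereo(smiles: str) -> str:
    """Remove tetrahedral marks to get a stereochemistry-depleted SMILES."""
    return smiles.replace("@", "")


def toy_encode(smiles: str, dim: int = 8) -> List[float]:
    """PLACEHOLDER for a heteroencoder: hash the string into `dim` floats.

    A real latent space vector would come from a trained model.
    """
    digest = hashlib.sha256(smiles.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def difference(a: List[float], b: List[float]) -> List[float]:
    """Element-wise difference a - b of two descriptors."""
    return [x - y for x, y in zip(a, b)]


def enantiomer_difference(
    smiles: str, encode: Callable[[str], List[float]] = toy_encode
) -> List[float]:
    """Descriptor of the molecule minus descriptor of its enantiomer."""
    return difference(encode(smiles), encode(invert_stereo(smiles)))


def depleted_difference(
    smiles: str, encode: Callable[[str], List[float]] = toy_encode
) -> List[float]:
    """Descriptor of the molecule minus descriptor of its stereo-depleted form."""
    return difference(encode(smiles), encode(strip_stereo(smiles)))


if __name__ == "__main__":
    alanine = "N[C@@H](C)C(=O)O"      # one enantiomer of alanine
    mirror = invert_stereo(alanine)   # the other enantiomer
    print(mirror)
    print(enantiomer_difference(alanine))
    print(depleted_difference(alanine))
```

One property worth noting: the enantiomer difference changes sign when the two enantiomers are exchanged, so the two members of a pair always receive opposite descriptors, whichever encoder is plugged in.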

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12398957
DOI: http://dx.doi.org/10.1186/s13321-025-01080-7

Similar Publications

Understanding the shape of chemistry data-Applications with persistent homology.

J Chem Phys

September 2025

Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, USA.

Chemical data often have complex and nonlinear patterns in how data points relate to one another. Concurrently, there are many situations where chemical data are of high dimensionality (e.g.

Introduction And Aim: Oral squamous cell carcinomas (OSCCs) are one of the most frequently diagnosed head and neck cancers with a poor prognosis despite the advancements in diagnostic techniques and treatment strategies. The progression of OSCC is driven by several molecular mechanisms, among them the overexpression of transcription factor RelA, which plays a crucial role by correlating with the clinicopathological characteristics.

Methods: This systematic investigation focused on identifying the top 25 crucial molecular descriptors to predict the RelA inhibitor through the quantitative structure-activity relationship (QSAR)-based artificial neural network model.

The newly designed complexes [Pd(dach)(phendione)](NO₃)₂ (1), [Pd(bpy)(phendione)](NO₃)₂ (2), and [Pd(dpa)(phendione)](NO₃)₂ (3) (where dach is 1,2-diaminocyclohexane, phendione is 1,10-phenanthroline-5,6-dione, bpy is 2,2'-bipyridine, and dpa is 2,2'-dipyridylamine) were synthesized and characterized by various techniques such as FT-IR, ¹H NMR, 2D ¹H NMR, D₂O exchange, UV-Vis spectroscopy, and elemental analysis. The theoretical studies (DFT approach) supported the formation of these complexes. The cytotoxic effects of these complexes (1, 2, and 3) on two different cell types, ovarian cancer-derived CHO cells and NIH/3T3 fibroblasts (normal cells), were investigated and compared with cisplatin.

Thermodynamics-inspired high-entropy oxide synthesis.

Nat Commun

September 2025

Department of Materials Science and Engineering, The Pennsylvania State University, University Park, PA, USA.

High-entropy oxide (HEO) thermodynamics transcend temperature-centric approaches, spanning a multidimensional landscape where oxygen chemical potential plays a decisive role. Here, we experimentally demonstrate how controlling the oxygen chemical potential coerces multivalent cations into divalent states in rock salt HEOs. We construct a preferred valence phase diagram based on thermodynamic stability and equilibrium analysis, alongside a high throughput enthalpic stability map derived from atomistic calculations leveraging machine learning interatomic potentials.

Oxygen exchange on mixed conducting oxide surfaces and how to modulate its kinetics has been in the focus of research for decades. Recent studies have shown that surface modifications can be used to tune the high temperature oxygen exchange kinetics of a single material systematically over several orders of magnitude, shifting the focus of research from bulk descriptors to a material's outermost surface. Herein, we aim to unify bulk and surface perspectives and derive general design principles for fast oxygen exchange based on three fundamental material properties: oxide reducibility, adsorption energetics, and surface acidity.
