Exploring the chemical subspace of RPLC: A data driven approach.

Anal Chim Acta

Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, 1098 XH, the Netherlands; UvA Data Science Center, University of Amsterdam, Amsterdam, 1012 WP, the Netherlands.

Published: August 2024


Citations

20

Article Abstract

Background: Chemical space comprises a vast number of possible structures, an unknown portion of which makes up the human and environmental exposome. Human and environmental samples are frequently analyzed by non-targeted analysis using liquid chromatography (LC) coupled to high-resolution mass spectrometry, often with a reversed-phase (RP) column. However, the contents of these samples are unknown prior to analysis and may comprise thousands of known and unknown chemical constituents. Moreover, it is unknown which part of chemical space is sufficiently retained and eluted using RPLC.

Results: We present a generic framework that uses a data-driven approach to predict whether molecules fall 'inside', 'maybe' inside, or 'outside' of the RPLC subspace. First, three retention index random forest (RF) regression models were constructed, which showed that molecular fingerprints can predict RPLC retention behavior. Second, these models were used to set up the dataset for building an RPLC RF classification model. The RPLC classification model correctly predicted whether a chemical belonged to the RPLC subspace with an accuracy of 92% on the testing set. Finally, applying this model to the 91 737 small molecules (i.e., ≤1 000 Da) in NORMAN SusDat showed that 19.1% fall 'outside' of the RPLC subspace.
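To put the final figure in absolute terms, the reported percentage can be converted into an approximate molecule count. The sketch below is plain arithmetic on the two numbers given in the abstract; the function names are illustrative, and the bounds assume only that 19.1% was rounded to one decimal place (the exact count is not stated in the abstract).

```python
import math

TOTAL_MOLECULES = 91_737   # small molecules (<= 1 000 Da) in NORMAN SusDat, per the abstract
OUTSIDE_PERCENT = 19.1     # share predicted 'outside' the RPLC subspace, per the abstract


def estimate_count(total, percent):
    """Point estimate of how many molecules a reported percentage corresponds to."""
    return round(total * percent / 100)


def count_bounds(total, percent, decimals=1):
    """Integer counts consistent with a percentage rounded to `decimals` places."""
    half_step = 0.5 * 10 ** (-decimals)          # 0.05 for one decimal place
    low = math.ceil(total * (percent - half_step) / 100)
    high = math.floor(total * (percent + half_step) / 100)
    return low, high


point = estimate_count(TOTAL_MOLECULES, OUTSIDE_PERCENT)
low, high = count_bounds(TOTAL_MOLECULES, OUTSIDE_PERCENT)
print(f"about {point} molecules 'outside' (between {low} and {high})")
# prints: about 17522 molecules 'outside' (between 17476 and 17567)
```

In other words, roughly 17 500 of the SusDat small molecules are predicted to fall outside the RPLC subspace.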

Significance and novelty: The RPLC chemical space model provides a major step towards mapping the chemical space. It can assess whether a chemical can potentially be measured with at least one RPLC method (though not necessarily every RPLC method) or whether a different selectivity should be considered. Moreover, knowing which chemicals are outside of the RPLC subspace can help reduce the number of potential candidates for library searching and avoid screening for chemicals that will not be present in RPLC data.
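The candidate-reduction idea can be sketched as a simple filter over a suspect list. This is a minimal illustration, not the authors' implementation: the `Candidate` type, the `filter_candidates` helper, and the compound names are hypothetical, and it assumes each candidate already carries one of the three labels named in the abstract ('inside', 'maybe', 'outside').

```python
from dataclasses import dataclass

# The three class labels named in the abstract.
LABELS = ("inside", "maybe", "outside")


@dataclass(frozen=True)
class Candidate:
    """Hypothetical library entry with a predicted RPLC subspace label."""
    name: str
    rplc_label: str


def filter_candidates(candidates, keep=("inside", "maybe")):
    """Return only the candidates whose predicted label is in `keep`."""
    for c in candidates:
        if c.rplc_label not in LABELS:
            raise ValueError(f"unknown label {c.rplc_label!r} for {c.name}")
    return [c for c in candidates if c.rplc_label in keep]


# Placeholder names; real entries would come from a suspect list.
candidates = [
    Candidate("compound_A", "inside"),
    Candidate("compound_B", "maybe"),
    Candidate("compound_C", "outside"),
]

retained = filter_candidates(candidates)
strict = filter_candidates(candidates, keep=("inside",))
print([c.name for c in retained])  # ['compound_A', 'compound_B']
```

Keeping 'maybe' candidates by default is a conservative choice for this sketch; passing `keep=("inside",)` gives a stricter search list at the risk of discarding chemicals that some RPLC methods could still retain.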

Source
http://dx.doi.org/10.1016/j.aca.2024.342869

Publication Analysis

Top Keywords

chemical space: 16
rplc: 12
rplc subspace: 12
rplc data: 8
data driven: 8
driven approach: 8
unknown chemical: 8
'outside' rplc: 8
rplc classification: 8
classification model: 8

Similar Publications

Quinoline as a Photochemical Toolbox: From Substrate to Catalyst and Beyond.

Acc Chem Res

September 2025

Department of Chemistry, FRQNT Centre for Green Chemistry and Catalysis, McGill University, 801 Sherbrooke Street W, Montréal, Québec H3A 0B8, Canada.

Conspectus: Molecular photochemistry, by harnessing the excited states of organic molecules, provides a platform fundamentally distinct from thermochemistry for generating reactive open-shell or spin-active species under mild conditions. Among its diverse applications, the resurgence of the Minisci-type reaction, a transformation historically reliant on thermally initiated radical conditions, has been fueled by modern photochemical strategies with improved efficiency and selectivity. Consequently, the photochemical Minisci-type reaction ranks among the most enabling methods for C(sp2)-H functionalizations of heteroarenes, which are of particular significance in medicinal chemistry for the rapid diversification of bioactive scaffolds.

The challenge of photocatalytic hydrogen production has motivated a targeted search for MXenes as a promising class of materials for this transformation because of their high mobility and high light absorption. High-throughput screening has been widely used to discover new materials, but the relatively high cost limits the chemical space for searching MXenes. We developed a deep-learning-enabled high-throughput screening approach that identified 14 stable candidates with suitable band alignment for water splitting from 23 857 MXenes.

Directed message passing neural networks enhanced graph convolutional learning for accurate polymer density prediction.

J Chem Phys

September 2025

National Synchrotron Radiation Laboratory, State Key Laboratory of Advanced Glass Materials, Anhui Provincial Engineering Research Center for Advanced Functional Polymer Films, University of Science and Technology of China, Hefei, Anhui 230029, China.

Polymer density is a critical factor influencing material performance and industrial applications, and it can be tailored by modifying the chemical structure of repeating units. Traditional polymer density characterization methods rely heavily on domain expertise; however, the vast chemical space comprising over one million potential polymer structures makes conventional experimental screening inefficient and costly. In this study, we proposed a machine learning framework for polymer density prediction, rigorously evaluating four models: neural networks (NNs), random forest (RF), XGBoost, and graph convolutional neural networks (GCNNs).

Purpose: Gadoxetic acid-enhanced hepatobiliary phase T-weighted (Tw) MRI is effective for the detection of focal liver lesions but lacks sufficient T contrast to distinguish benign from malignant lesions. Although the addition of T, diffusion, and dynamic contrast-enhanced Tw imaging improves lesion characterization, these methods often do not provide adequate spatial resolution to identify subcentimeter lesions. This work proposes a high-resolution, volumetric, free-breathing liver MRI method that produces colocalized fat-suppressed, variable Tw images from a single acquisition, thereby improving both lesion detection and characterization.

Long-duration spaceflight exposes astronauts to various stressors that can alter human physiology, potentially causing immediate and long-term health effects. These stressors can damage biomolecules, cells, tissues, and organs, leading to adverse outcomes. Developing adverse outcome pathways (AOPs) relevant to radiation exposure can guide research priorities and inform risk assessments of future space exploration activities.
