Article Abstract

Retention time (RT) can provide information orthogonal to mass spectra, supporting qualitative identification. However, RT depends on experimental conditions and column parameters, and it is difficult to obtain a large amount of RT data under the user's own experimental conditions. Hence, various machine learning methods, including advanced deep learning approaches, have been developed for RT prediction. Most of them, however, are limited to a given column and set of operational conditions, and data sparsity often hinders prediction performance. In this study, we propose MDL-TL, a method that combines multiple data sets to jointly train the base model. MDL-TL vectorizes the column and conditions (chromatographic parameters, CPs) using word2vec and autoencoders, and distinguishes data sets from different chromatographic experiments by including the CPs in the compound representation. This not only augments the data but also introduces the CPs into RT prediction, allowing the pretrained model to be transferred efficiently to different target systems by fine-tuning. MDL-TL was evaluated against five popular deep learning approaches and four machine learning approaches on 14 reversed-phase liquid chromatography data sets and 14 hydrophilic interaction liquid chromatography data sets. The results show that our method surpassed the compared methods, including transfer learning methods based on the METLIN small molecule retention time (SMRT) data set, in mean absolute error, median absolute error, and mean relative error in most cases, demonstrating that MDL-TL is a promising approach for predicting RTs across various chromatographic systems and operational conditions.
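
The abstract names three error metrics without defining them. As a minimal sketch, assuming the standard definitions (mean and median of the absolute errors, and the mean of the absolute error divided by the observed RT), they could be computed as follows. The function name `rt_errors` and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean, median


def rt_errors(observed, predicted):
    """Return MAE and MedAE (in the units of the inputs) and MRE (a fraction)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Absolute error per compound.
    abs_err = [abs(o - p) for o, p in zip(observed, predicted)]
    # Relative error per compound; assumes no observed RT is zero.
    rel_err = [e / abs(o) for e, o in zip(abs_err, observed)]
    return {
        "mae": mean(abs_err),
        "medae": median(abs_err),
        "mre": mean(rel_err),
    }


# Made-up retention times in seconds, for illustration only.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]
errors = rt_errors(observed, predicted)
```

MedAE is less sensitive than MAE to a few badly predicted compounds, and MRE normalizes by the observed RT, which makes errors easier to compare across data sets with different run lengths.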

Source
http://dx.doi.org/10.1021/acs.analchem.5c01703

Similar Publications

Solvation Structure of Np in a Noncomplexing Environment.

Inorg Chem

September 2025

Pacific Northwest National Laboratory, Richland, Washington 99352, United States.

The solvation structure of an Np ion in an aqueous, noncomplexing and nonoxidizing environment of trifluoromethanesulfonic (triflic) acid was investigated with X-ray absorption spectroscopy (XAS) combined with ab initio molecular dynamics (AIMD) and time-dependent density functional theory (TDDFT) calculations. Np L-edge X-ray absorption near-edge structure (XANES) and extended X-ray absorption fine structure (EXAFS) data were collected for Np in 1, 3, and 7 M triflic acid using a laboratory-scale spectrometer and separately at a synchrotron facility, producing data sets in excellent agreement. TDDFT calculations revealed a weak pre-edge feature not previously reported for Np L-edge XANES.

In the zebrafish larval toxicity model, phenotypic changes induced by chemical exposure can potentially be explained and predicted by the analysis of gene expression changes at sub-phenotypic concentrations. The increase in knowledge of gene pathway-specific effects arising from the zebrafish transcriptomic model has the potential to enhance the role of the larval zebrafish as a component of Integrated Approaches to Testing and Assessment (IATA). In this paper, we compared the transcriptomic responses of triphenyl phosphate between two standard exposure paradigms, the Zebrafish Embryo Toxicity (ZET) and General and Behavioural Toxicity (GBT) assays.

Summary: In Bayesian phylogenetic and phylodynamic studies it is common to summarise the posterior distribution of trees with a time-calibrated summary phylogeny. While the maximum clade credibility (MCC) tree is often used for this purpose, we here show that a novel summary tree method, the highest independent posterior subtree reconstruction (HIPSTR), contains clades with consistently higher support than the MCC tree. We also provide faster computational routines for estimating both summary trees in an updated version of TreeAnnotator X, an open-source software program that summarises the information from a sample of trees and returns many helpful statistics, such as the individual clade credibilities contained in the summary tree.

PERC: a suite of software tools for the curation of cryoEM data with application to simulation, modeling and machine learning.

Acta Crystallogr F Struct Biol Commun

October 2025

Science and Technology Facilities Council, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom.

Ease of access to data, tools and models expedites scientific research. In structural biology there are now numerous open repositories of experimental and simulated data sets. Being able to easily access and utilize these is crucial to allow researchers to make optimal use of their research effort.

Exploring the Frontiers of Computational NMR: Methods, Applications, and Challenges.

Chem Rev

September 2025

Center for Computational Life Sciences, Lerner Research Institute, The Cleveland Clinic, Cleveland, Ohio 44195, United States.

Computational methods have revolutionized NMR spectroscopy, driving significant advancements in structural biology and related fields. This review focuses on recent developments in quantum chemical and machine learning approaches for computational NMR, emphasizing their role in enhancing accuracy, efficiency, and scalability. QM methods provide precise predictions of NMR parameters, enabling detailed structural characterization of diverse systems.
