We investigate the learning of quantitative structure-activity relationships (QSARs) as a case study of meta-learning. This application area is of the highest societal importance, as it is a key step in the development of new medicines. The standard QSAR learning problem is: given a target (usually a protein) and a set of chemical compounds (small molecules) with associated bioactivities (e.g. inhibition of the target), learn a predictive mapping from molecular representation to activity. Although almost every type of machine learning method has been applied to QSAR learning, there is no agreed single best way of learning QSARs, and therefore the problem area is well suited to meta-learning. We first carried out the most comprehensive comparison to date of machine learning methods for QSAR learning: 18 regression methods and 3 molecular representations, applied to more than 2700 QSAR problems. (These results have been made publicly available on OpenML and represent a valuable resource for testing novel meta-learning methods.) We then investigated the utility of algorithm selection for QSAR problems. We found that this meta-learning approach outperformed the best individual QSAR learning method (random forests using a molecular fingerprint representation) by up to 13%, on average. We conclude that meta-learning outperforms base-learning methods for QSAR learning, and as this investigation is one of the most extensive comparisons of base and meta-learning methods ever made, it provides evidence for the general effectiveness of meta-learning over base-learning.
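The algorithm-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the meta-features, the RMSE values, the method names `rf_fingerprint` and `svm_descriptors`, and the 1-nearest-neighbour meta-learner are all assumptions made for illustration. The sketch contrasts the "single best" base learner with a per-problem choice driven by meta-features.

```python
import math

# Hypothetical toy meta-dataset: one row per QSAR problem.
# "features" are made-up meta-features (number of compounds, number of descriptors).
# "rmse" holds made-up cross-validated errors of each base learner on that problem.
META_DATA = [
    {"features": (50.0, 1024.0), "rmse": {"rf_fingerprint": 0.90, "svm_descriptors": 0.70}},
    {"features": (60.0, 1024.0), "rmse": {"rf_fingerprint": 0.85, "svm_descriptors": 0.75}},
    {"features": (900.0, 1024.0), "rmse": {"rf_fingerprint": 0.60, "svm_descriptors": 0.80}},
    {"features": (1200.0, 1024.0), "rmse": {"rf_fingerprint": 0.55, "svm_descriptors": 0.78}},
]


def best_method(rmse):
    """Return the base learner with the lowest error on one problem."""
    return min(rmse, key=rmse.get)


def single_best(meta_data):
    """Base-learning baseline: the method with the lowest mean RMSE over all problems."""
    methods = list(meta_data[0]["rmse"])
    return min(
        methods,
        key=lambda m: sum(row["rmse"][m] for row in meta_data) / len(meta_data),
    )


def select_algorithm(meta_data, features):
    """Algorithm selection via 1-nearest neighbour (an assumed stand-in meta-learner):
    reuse the best method of the most similar previously seen problem."""
    nearest = min(meta_data, key=lambda row: math.dist(row["features"], features))
    return best_method(nearest["rmse"])


if __name__ == "__main__":
    print("single best:", single_best(META_DATA))
    print("small new problem:", select_algorithm(META_DATA, (55.0, 1024.0)))
    print("large new problem:", select_algorithm(META_DATA, (1000.0, 1024.0)))
```

In this toy data the single best method is the random-forest variant, yet the selector picks a different method for small problems, which is the kind of per-problem gain algorithm selection aims for. A real study would use richer meta-features and a stronger meta-learner.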
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6956898 | PMC |
http://dx.doi.org/10.1007/s10994-017-5685-x | DOI Listing |
SAR QSAR Environ Res
August 2025
Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China.
Phosphorylation plays an important role in CDK2 activity and inhibitor binding, but the underlying molecular mechanism is still insufficiently understood. To address this gap, the current study integrates molecular dynamics (MD) simulations, deep learning (DL) techniques, and free energy landscape (FEL) analysis to systematically explore the action mechanisms of two inhibitors (SCH and CYC) when CDK2 is phosphorylated and bound to CyclinE. With the help of MD trajectory-based DL, key functional domains such as the L3 and L7 loops are successfully identified.
SAR QSAR Environ Res
August 2025
Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India.
Evaluating the permeability of different molecular structures across the Caco-2 cell line is crucial for drug discovery and development. The present study primarily focuses on developing machine learning-based multiclass classification models for predicting the permeability of molecules across the Caco-2 cell line. However, the class imbalance in permeability datasets poses a significant challenge for developing predictive models in the case of multiclass analysis.
Ecotoxicol Environ Saf
September 2025
Key Laboratory of Groundwater Resources and Environment, Ministry of Education, Jilin Provincial Key Laboratory of Water Resources and Environment, College of New Energy and Environment, Jilin University, Changchun 130012, China.
Liquid crystal monomers (LCMs) have emerged as novel endocrine-disrupting chemicals that affect the growth, development, and metabolism of organisms by binding to nuclear hormone receptors (NHRs). However, studies on how the molecular features of LCMs affect their binding affinities remain limited. In this study, considering the challenge that activity cliffs pose for linear quantitative structure-activity relationship modeling, a multidimensional feature fusion model was developed to predict the binding affinities of 1173 LCMs to 15 NHRs.
Anticancer Agents Med Chem
September 2025
Department of Bioscience and Biotechnology, Banasthali Vidyapith, Rajasthan-304022, India.
Introduction: Microbial metabolites represent a valuable source of bioactive compounds with promising anticancer properties. However, conventional drug discovery approaches are time-intensive and resource-demanding.
Methods: Recent developments in artificial intelligence (AI), machine learning (ML), molecular docking, and quantitative structure-activity relationship (QSAR) modeling have been examined for their role in the identification and optimization of microbial metabolites.
Nanoscale Horiz
September 2025
University of Gdansk, Faculty of Chemistry, Laboratory of Environmental Chemoinformatics, Wita Stwosza 63, 80-308 Gdansk, Poland.
The primary aim of our study was to address the problem of transcriptomic data complexity by introducing a novel transcriptomic response index (TRI), compressing the entire transcriptomic space into a single variable, and linking it with the inhaled multiwalled carbon nanotubes (MWCNTs) properties. This methodology allows us to predict fold change values of thousands of differentially expressed genes (DEGs) using a single variable and a single quantitative structure-activity relationship (QSAR) model. In the context of this work, TRI compressed 5167 DEGs into a single variable, explaining 99.