Meta-QSAR: a large-scale application of meta-learning to drug design and discovery.

Ivan Olier, Noureddin Sadawi, G Richard Bickerton, Joaquin Vanschoren, Crina Grosan, Larisa Soldatova, Ross D King

Mach Learn

University of Manchester, Manchester, UK.

Published: December 2017

We investigate the learning of quantitative structure-activity relationships (QSARs) as a case study of meta-learning. This application area is of the highest societal importance, as it is a key step in the development of new medicines. The standard QSAR learning problem is: given a target (usually a protein) and a set of chemical compounds (small molecules) with associated bioactivities (e.g. inhibition of the target), learn a predictive mapping from molecular representation to activity. Although almost every type of machine learning method has been applied to QSAR learning, there is no agreed single best way of learning QSARs, and the problem area is therefore well suited to meta-learning. We first carried out the most comprehensive comparison to date of machine learning methods for QSAR learning: 18 regression methods and 3 molecular representations, applied to more than 2700 QSAR problems. (These results have been made publicly available on OpenML and represent a valuable resource for testing novel meta-learning methods.) We then investigated the utility of algorithm selection for QSAR problems. We found that this meta-learning approach outperformed the best individual QSAR learning method (random forests using a molecular fingerprint representation) by up to 13%, on average. We conclude that meta-learning outperforms base-learning methods for QSAR learning and, as this investigation is one of the most extensive comparisons of base and meta-learning methods ever made, that it provides evidence for the general effectiveness of meta-learning over base-learning.
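The algorithm-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the meta-features, the error values, the learner name `svm_descriptors`, and the 1-nearest-neighbour meta-learner are all assumptions chosen to keep the example small. Only `rf_fingerprint` echoes a method named in the abstract (random forests on a fingerprint representation), and its numbers here are invented too.

```python
import math

# Hypothetical meta-dataset: one row per QSAR problem.
# meta_features: (number of compounds, mean molecular weight); invented values.
# rmse: error of each candidate base learner on that problem; invented values.
META_DATASET = [
    {"meta_features": (50.0, 300.0),
     "rmse": {"rf_fingerprint": 0.90, "svm_descriptors": 0.70}},
    {"meta_features": (60.0, 320.0),
     "rmse": {"rf_fingerprint": 0.85, "svm_descriptors": 0.75}},
    {"meta_features": (900.0, 410.0),
     "rmse": {"rf_fingerprint": 0.55, "svm_descriptors": 0.80}},
    {"meta_features": (1200.0, 430.0),
     "rmse": {"rf_fingerprint": 0.50, "svm_descriptors": 0.78}},
]


def best_learner(rmse_by_learner):
    """Return the name of the learner with the lowest RMSE."""
    return min(rmse_by_learner, key=rmse_by_learner.get)


def feature_scales(rows):
    """Per-feature range, so that no single meta-feature dominates the distance."""
    columns = list(zip(*(row["meta_features"] for row in rows)))
    return [(max(col) - min(col)) or 1.0 for col in columns]


def select_algorithm(new_features, rows=META_DATASET):
    """Pick a learner for a new problem via 1-nearest-neighbour on meta-features."""
    scales = feature_scales(rows)

    def distance(row):
        return math.sqrt(sum(
            ((a - b) / s) ** 2
            for a, b, s in zip(row["meta_features"], new_features, scales)
        ))

    nearest = min(rows, key=distance)
    return best_learner(nearest["rmse"])


def single_best(rows=META_DATASET):
    """Baseline: the one learner with the lowest mean RMSE over all problems."""
    learners = rows[0]["rmse"].keys()
    return min(
        learners,
        key=lambda name: sum(r["rmse"][name] for r in rows) / len(rows),
    )


if __name__ == "__main__":
    print("single best learner:", single_best())
    print("choice for a small problem:", select_algorithm((55.0, 305.0)))
    print("choice for a large problem:", select_algorithm((1000.0, 420.0)))
```

On this toy data the baseline picks one learner for every problem, while the selector can recommend a different learner for a small problem than for a large one. That per-problem choice is the mechanism by which algorithm selection can beat the single best base method.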

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6956898
DOI: http://dx.doi.org/10.1007/s10994-017-5685-x

Similar Publications

Unravelling phosphorylation-induced impacts on inhibitor-CDK2 through multiple independent molecular dynamics simulations and deep learning.

SAR QSAR Environ Res

August 2025

Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China.

W Zhang, G Xu, X Li, J Cong, P Wang

Phosphorylation plays an important role in the activity of CDK2 and in inhibitor binding, but the corresponding molecular mechanism is still insufficiently understood. To address this gap, the current study integrates molecular dynamics (MD) simulations, deep learning (DL) techniques, and free energy landscape (FEL) analysis to systematically explore the action mechanisms of two inhibitors (SCH and CYC) when CDK2 is phosphorylated and bound to CyclinE. With the help of MD trajectory-based DL, key functional domains such as the L3 and L7 loops are successfully identified.

First report on machine learning based multiclass classification of Caco-2 permeability using different balancing strategies.

SAR QSAR Environ Res

August 2025

Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India.

I Dasgupta, S Gayen

Evaluating the permeability of different molecular structures across the Caco-2 cell line is crucial for drug discovery and development. The present study primarily focuses on developing machine learning-based multiclass classification models for predicting the permeability of molecules across the Caco-2 cell line. However, the class imbalance in permeability datasets poses a significant challenge for developing predictive models in the case of multiclass analysis.

Predicting binding affinities of liquid crystal monomers: An activity cliffs-driven multidimensional feature fusion model.

Ecotoxicol Environ Saf

September 2025

Key Laboratory of Groundwater Resources and Environment, Ministry of Education, Jilin Provincial Key Laboratory of Water Resources and Environment, College of New Energy and Environment, Jilin University, Changchun 130012, China.

Han Zhang, Zhiyong Guo, Ruilin Wang, Liwen Zhang, Deming Dong

Liquid crystal monomers (LCMs) have emerged as novel endocrine-disrupting chemicals that affect the growth, development, and metabolism of organisms by binding to nuclear hormone receptors (NHRs). However, studies on the impact of LCMs' molecular features on their binding affinities remain limited. In this study, considering the challenge of activity cliffs in linear quantitative structure-activity relationship modeling, a multidimensional feature fusion model was developed to predict the binding affinities of 1173 LCMs to 15 NHRs.

Microbial-Derived Anti-Cancer Compounds: Advances in Drug Discovery, Bioengineering, and Therapeutic Applications.

Anticancer Agents Med Chem

September 2025

Department of Bioscience and Biotechnology, Banasthali Vidyapith, Rajasthan-304022, India.

Ekta Tyagi, Divya Jain, Rajabrata Bhuyan, Anand Prakash

Introduction: Microbial metabolites represent a valuable source of bioactive compounds with promising anticancer properties. However, conventional drug discovery approaches are time-intensive and resource-demanding.

Methods: Recent developments in artificial intelligence (AI), machine learning (ML), molecular docking, and quantitative structure-activity relationship (QSAR) modeling have been examined for their role in the identification and optimization of microbial metabolites.

TRIumph in nanotoxicology: simplifying transcriptomics into a single predictive variable.

Nanoscale Horiz

September 2025

University of Gdansk, Faculty of Chemistry, Laboratory of Environmental Chemoinformatics, Wita Stwosza 63, 80-308 Gdansk, Poland.

Viacheslav Muratov, Karolina Jagiello, Tomasz Puzyn

The primary aim of our study was to address the problem of transcriptomic data complexity by introducing a novel transcriptomic response index (TRI), compressing the entire transcriptomic space into a single variable, and linking it with the properties of inhaled multiwalled carbon nanotubes (MWCNTs). This methodology allows us to predict fold change values of thousands of differentially expressed genes (DEGs) using a single variable and a single quantitative structure-activity relationship (QSAR) model. In the context of this work, TRI compressed 5167 DEGs into a single variable, explaining 99.
