Machine Learning Models Based on Enlarged Chemical Spaces for Screening Carcinogenic Chemicals.

Chem Res Toxicol

Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China.

Published: July 2025


Citations

20

Article Abstract

Machine learning (ML) models for screening carcinogenic chemicals are critical for the sound management of chemicals. Previous models were built on small datasets and lacked the applicability domain (AD) characterization that regulatory applications require. In the current study, an enlarged dataset of 1697 compounds (940 carcinogens and 757 non-carcinogens) was curated and used to construct screening models based on 12 types of molecular fingerprints, four ML algorithms, and two graph neural networks. The AD of the optimal model was defined using a state-of-the-art characterization methodology based on the analysis of structure-activity landscapes (SALs). The optimal model, a random forest trained on PubChem fingerprints, outperformed previous models, with an area under the receiver operating characteristic curve of 86.2% on the validation set after the AD was applied. The optimal model, coupled with the AD, was then used to screen the Inventory of Existing Chemical Substances of China (IECSC) and plastic additives datasets, identifying 1282 chemicals from the IECSC and 841 plastic additives as carcinogenic chemicals. The screening model coupled with the AD may serve as a promising tool for prioritizing chemicals of carcinogenic concern, facilitating the sound management of chemicals.
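To illustrate what evaluating a classifier only inside its applicability domain can look like, the following is a minimal Python sketch. It is not the authors' method: the abstract defines the AD from structure-activity landscapes, whereas this sketch substitutes a much simpler, hypothetical rule (a validation compound is in-domain if its nearest training compound has a Tanimoto similarity at or above a chosen threshold). The fingerprints, labels, scores, and the 0.5 threshold are all made-up toy values, and every function name is an assumption introduced for illustration.

```python
from typing import List, Sequence, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def in_domain(query: Set[int], training: Sequence[Set[int]], threshold: float) -> bool:
    """Hypothetical AD rule: the nearest training compound must be similar enough.

    Assumes `training` is non-empty.
    """
    return max(tanimoto(query, t) for t in training) >= threshold


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_within_domain(
    fps: Sequence[Set[int]],
    labels: Sequence[int],
    scores: Sequence[float],
    training: Sequence[Set[int]],
    threshold: float,
) -> Tuple[float, int]:
    """Return the AUC over in-domain compounds and how many compounds were kept."""
    keep: List[int] = [i for i, fp in enumerate(fps) if in_domain(fp, training, threshold)]
    return roc_auc([labels[i] for i in keep], [scores[i] for i in keep]), len(keep)


# Toy data (invented): on-bit indices, true labels (1 = carcinogen), model scores.
training_fps = [{1, 2, 3, 4}, {2, 3, 4, 5}, {10, 11, 12}]
val_fps = [{1, 2, 3, 4}, {2, 3, 4}, {10, 11, 12, 13}, {3, 4, 5}, {20, 21, 22}, {30, 31}]
val_labels = [1, 0, 1, 0, 0, 1]
val_scores = [0.9, 0.2, 0.8, 0.4, 0.7, 0.3]

auc_all = roc_auc(val_labels, val_scores)
auc_ad, n_kept = auc_within_domain(val_fps, val_labels, val_scores, training_fps, 0.5)
print(f"AUC on all compounds: {auc_all:.3f}")
print(f"AUC inside the AD:    {auc_ad:.3f} ({n_kept} of {len(val_fps)} compounds kept)")
```

In this toy case the two compounds that share no bits with the training set are excluded, and the AUC on the remaining compounds is higher than on the full set. This shows why a metric reported with an AD applied should be read together with the fraction of compounds the AD retains.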

Source
http://dx.doi.org/10.1021/acs.chemrestox.4c00523

Similar Publications

Radon (Rn) is a naturally occurring radioactive gas produced by the decay of uranium-bearing minerals in rocks and soils. Long-term exposure to elevated radon levels in drinking water is associated with an increased risk of stomach and lung cancers. This study aims to assess the concentration of radon in groundwater and evaluate its potential health risks in six cancer-affected districts, i.

Characterization of the extrinsic and intrinsic signatures and therapeutic vulnerability of small cell lung cancers.

Signal Transduct Target Ther

September 2025

State Key Laboratory of Molecular Oncology & Department of Medical Oncology & Department of Pathology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.

Small-cell lung cancer (SCLC), an aggressive neuroendocrine tumor strongly associated with exposure to tobacco carcinogens, is characterized by early dissemination and dismal prognosis with a five-year overall survival of less than 7%. High-frequency gain-of-function mutations in oncogenes are rarely reported, and intratumor heterogeneity (ITH) remains to be determined in SCLC. Here, via multiomics analyses of 314 SCLCs, we found that the ASCL1/MKI67 and ASCL1/CRIP2 clusters accounted for 74.

Heart failure (HF) and lung cancer (LC) often coexist, yet their shared molecular mechanisms are unclear. We analyzed transcriptome data from the NCBI Gene Expression Omnibus (GEO) database (GSE141910, GSE57338) to identify 346 HF-related differentially expressed genes (DEGs); weighted gene co-expression network analysis (WGCNA) then pinpointed 70 hub candidates. Further screening of these 70 hub candidates in TCGA lung cancer cohorts via LASSO, Random Forest, and multivariate Cox regression suggested CYP4B1 as the only independent prognostic marker.

MOF-engineered activated carbon adsorbent enabling semi-selective ethyl carbamate removal in fermented foods.

Food Res Int

November 2025

Innovation Center for Advanced Brewing Science and Technology, College of Biomass Science and Engineering, Sichuan University, Chengdu 610065, PR China; National Engineering Research Center of Solid-state Brewing, Luzhou Laojiao Co. Ltd, Luzhou 646000, China; Key Laboratory of Monitoring and Assessm

Fermented foods are valued for their diverse flavor and health benefits, but the formation of ethyl carbamate (EC), a potential carcinogen, during production and storage poses challenges. Current EC reduction methods often compromise flavor and bioactive components. This study exemplifies a novel adsorbent combining activated carbon with metal-organic framework (MOF) chemistry for semi-selective EC removal.

A novel dual-mode sensing system integrating a magnetic core-shell CuFeO/Cu/MnO nanozyme with a stimuli-responsive agarose-deep eutectic solvent hydrogel (DES-Aga) is reported. The nanozyme exhibits exceptional oxidase-like activity, characterized by a low Michaelis constant (Km = 0.14 mM) and high catalytic efficiency (V = 1.
