First report on machine learning based multiclass classification of Caco-2 permeability using different balancing strategies.

SAR QSAR Environ Res

Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India.

Published: August 2025


Article Abstract

Evaluating the permeability of different molecular structures across the Caco-2 cell line is crucial for drug discovery and development. The present study primarily focuses on developing machine learning-based multiclass classification models for predicting the permeability of molecules across the Caco-2 cell line. However, the class imbalance in permeability datasets poses a significant challenge for developing predictive models in the case of multiclass analysis. To address the class imbalance issue, we employed different balancing strategies, including oversampling, undersampling, and hybrid approaches, to balance the training set. A five-fold cross-validation approach was employed for optimizing the hyperparameters. After completion of the evaluation process, we concluded that the XGBoost multiclass classifier trained with ADASYN oversampling achieved the best performance (accuracy, 0.717; MCC, 0.512 on the test set). Additionally, extreme permeability classes were also classified separately, and the best model exhibited strong predictive performance (accuracy, 0.853; MCC, 0.703 on the test set). To enhance the interpretability of the best-performing models, we performed SHAP analysis to elucidate descriptor importance and provide explainability. Our findings demonstrate that appropriate data balancing strategies can significantly improve predictive performance in multiclass permeability classification, offering a valuable framework for drug permeability assessment.
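The two metrics quoted in the abstract, accuracy and the Matthews correlation coefficient (MCC), together with the idea of balancing only the training set, can be illustrated with a small pure-Python sketch. This is not the authors' pipeline: the study reports ADASYN oversampling with an XGBoost classifier, whereas the sketch below substitutes plain random oversampling (duplicating minority-class samples) as a simpler stand-in. The function names, class labels, and toy data are hypothetical and chosen only for illustration. The MCC function uses the standard multiclass generalization computed from true and predicted class counts.

```python
import math
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from class counts."""
    s = len(y_true)                                      # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))      # correct predictions
    t_counts = Counter(y_true)                           # samples per true class
    p_counts = Counter(y_pred)                           # samples per predicted class
    labels = set(t_counts) | set(p_counts)
    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_tt = s * s - sum(v * v for v in t_counts.values())
    cov_pp = s * s - sum(v * v for v in p_counts.values())
    denom = math.sqrt(cov_tt * cov_pp)
    # By convention, return 0 when the denominator vanishes
    # (for example, when every prediction falls in one class).
    return cov_tp / denom if denom else 0.0


def random_oversample(X, y, seed=0):
    """Duplicate minority-class samples until all classes match the largest.

    Assumption: a simple stand-in for ADASYN, which instead synthesizes
    new minority samples. Apply to the training set only.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Hypothetical three-class toy example (low / moderate / high permeability).
y_true = ["low", "low", "mod", "mod", "high", "high"]
y_pred = ["low", "mod", "mod", "mod", "high", "low"]
print("accuracy:", round(accuracy(y_true, y_pred), 3))
print("MCC:", round(multiclass_mcc(y_true, y_pred), 3))

# Imbalanced toy training set: three "low" samples, one "high" sample.
X_train = [[0.1], [0.2], [0.3], [0.9]]
y_train = ["low", "low", "low", "high"]
X_bal, y_bal = random_oversample(X_train, y_train)
print("balanced class counts:", dict(Counter(y_bal)))
```

Balancing is applied to the training data alone so that the test-set metrics reflect the original class distribution. MCC is reported alongside accuracy because it accounts for every cell of the confusion matrix and is less flattered by a dominant class.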

Source
http://dx.doi.org/10.1080/1062936X.2025.2552134

Similar Publications

Concurrent presentation of pulmonary nocardiosis and granulomatosis with polyangiitis (GPA) is exceptionally rare and diagnostically challenging, given the overlapping clinical and radiological features. We report a 54-year-old female with fever, cough, weight loss, and arthralgia. Chest imaging showed multiple pulmonary nodules; serology revealed positive anti-neutrophil cytoplasmic antibodies (proteinase 3), and lung biopsy demonstrated necrotizing granulomatous inflammation with Nocardia species.

S-glutathionylation (SSG), a redox-sensitive post-translational modification mediated by glutathione, regulates protein structure and function through reversible disulfide bond formation at cysteine residues. Glutaredoxins (GRXs), pivotal antioxidant enzymes, catalyze SSG dynamics to maintain thiol homeostasis. Recent advances in redox proteomics have revealed that SSG dysregulation is intricately linked to neurodegenerative, cardiovascular, pulmonary and malignant diseases.

Medication-related osteonecrosis of the jaw (MRONJ) is a rare but well-recognized complication of treatment with antiresorptive agents. Medication-related osteonecrosis of the external auditory canal (MROEAC), on the other hand, is even rarer and mostly reported during bisphosphonate exposure. Its pathophysiology is thought to involve complex multifactorial processes, including inhibition of bone remodeling, altered angiogenesis, infection, and inflammation.

Background: The increasing prevalence of sports injuries among young female volleyball players, driven by biomechanical and hormonal factors, necessitates effective prevention strategies. Screening tools like the Functional Movement Screen (FMS) and Star Excursion Balance Test (SEBT) often show inconsistent predictive validity for injury risk in this population. This study investigates associations between FMS, SEBT, agility, and muscle strength with injury risk in young female volleyball players to refine prediction models and inform targeted interventions.

Gastrointestinal eubiosis is essential for maintaining overall host wellbeing. Post-weaning diarrhea (PWD) is a common issue in pig development, arising from weaning stress, which disrupts the gut microbiota balance and increases susceptibility to infections. The primary bacterial pathogen linked to PWD is enterotoxigenic Escherichia coli (ETEC).
