A multi-biomarker machine learning approach for early prediction of interstitial lung disease in rheumatoid arthritis.

BMC Pulm Med

Department of Rheumatology, Xi'an Fifth Hospital, 112 Xiguanzheng Street, Lianhu District, Xi'an, Shaanxi, 710000, People's Republic of China.

Published: August 2025



Article Abstract

Background: Interstitial lung disease (ILD) is a severe complication affecting 10-30% of rheumatoid arthritis (RA) patients. Current diagnostic methods typically detect ILD only after substantial lung damage has occurred. This delay emphasizes the need for early detection strategies. This study aims to develop and validate machine learning models for early RA-ILD prediction and identify key predictive biomarkers.

Methods: We conducted a cross-sectional study of 149 RA patients (84 with ILD, 65 without ILD) enrolled between January 2020 and December 2023. We evaluated demographic characteristics, clinical parameters, and laboratory markers, including inflammatory indicators, hematological parameters, and specific biomarkers. We developed four machine learning (ML) models (XGBoost, Random Forest, Support Vector Machine, and Logistic Regression) and compared their ability to predict ILD.
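The abstract does not say how the cohort was partitioned for model validation. As a minimal sketch of one common choice, the snippet below performs a stratified hold-out split that preserves the 84:65 class ratio. The function name `stratified_split`, the 20% test fraction, and the seed are illustrative assumptions, not details from the study.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling each class separately
    so both subsets keep roughly the original class ratio."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Cohort sizes from the abstract: 84 RA patients with ILD (label 1),
# 65 without ILD (label 0). The split parameters are assumptions.
labels = [1] * 84 + [0] * 65
train_idx, test_idx = stratified_split(labels)
```

With a cohort this small, a single hold-out set gives a noisy performance estimate, so repeated or cross-validated splits are often preferred; which scheme the authors used is not stated here.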

Results: The XGBoost model demonstrated superior predictive performance (AUC = 0.891, 95% CI: 0.847-0.935). Feature importance analysis identified Krebs von den Lungen-6 (KL-6) as the strongest predictor (importance score = 0.285), followed by interleukin-6 (IL-6) and cytokeratin 19 fragment (CYFRA21-1). The ILD group exhibited significantly elevated levels of inflammatory markers and specific biomarkers, particularly KL-6 (826.4 ± 458.2 vs. 285.6 ± 124.8 U/ml, P < 0.001), alongside distinct patterns in hematological parameters.
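For readers unfamiliar with how an AUC and its confidence interval are obtained, here is a minimal sketch. It computes the AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, and estimates a 95% interval with a percentile bootstrap. The abstract does not state which interval method the authors used, so the bootstrap is an assumption, and the scores and labels below are synthetic toy values, not study data.

```python
import random


def auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes to define an AUC
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic toy example: predicted risks and true ILD status (1 = ILD).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_ci = bootstrap_ci(toy_scores, toy_labels)
```

The pairwise definition is quadratic in sample size, which is fine for a cohort of 149 but would be replaced by a rank-based computation on larger datasets.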

Conclusion: Machine learning approaches, particularly XGBoost, show promise for early RA-ILD prediction. Integrating KL-6 and the other identified biomarkers into clinical screening protocols may facilitate earlier detection and improve patient outcomes. These findings suggest that machine learning models could support individualized risk stratification and early intervention in RA-ILD management.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12355783
DOI: http://dx.doi.org/10.1186/s12890-025-03855-y
