Hypothesis-free discovery of novel cancer predictors using machine learning.

Eur J Clin Invest

Australian Centre for Precision Health, Unit of Clinical and Health Sciences, University of South Australia, Adelaide, South Australia, Australia.

Published: October 2023


Citations

20

Article Abstract

Background: Cancer is a leading cause of morbidity and mortality worldwide, and better understanding of the risk factors could enhance prevention.

Methods: We conducted a hypothesis-free analysis combining machine learning and statistical approaches to identify cancer risk factors from 2828 potential predictors captured at baseline. There were 459,169 UK Biobank participants free from cancer at baseline and 48,671 new cancer cases during the 10-year follow-up. Logistic regression models adjusted for age, sex, ethnicity, education, material deprivation, smoking, alcohol intake, body mass index and skin colour (as a proxy for sun sensitivity) were used for obtaining adjusted odds ratios, with continuous predictors presented using quintiles (Q).
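The quintile comparison described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: it computes an unadjusted odds ratio with a Wald confidence interval from a 2x2 table, whereas the study used logistic regression adjusted for the listed covariates. The function names and the example counts are hypothetical.

```python
import math
from statistics import quantiles


def quintile_labels(values):
    """Label each value 1..5 using the sample's own quintile cut points."""
    cuts = quantiles(values, n=5)  # four cut points splitting the sample into fifths
    return [1 + sum(v > c for c in cuts) for v in values]


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI from a 2x2 table.

    a: cases in Q5, b: non-cases in Q5,
    c: cases in Q1, d: non-cases in Q1 (illustrative layout, an assumption).
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual Wald approximation
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study
    print(odds_ratio_wald(20, 80, 10, 90))
    print(quintile_labels(list(range(1, 101)))[:5])
```

An adjusted estimate would instead come from the coefficient on a Q5 indicator in a logistic regression that also includes the covariates named above.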

Results: In addition to smoking, older age and male sex, positively associated features included several anthropometric characteristics, whole body water mass, pulse, hypertension and biomarkers such as urinary microalbumin (Q5 vs. Q1 OR 1.16, 95% CI = 1.13-1.19), C-reactive protein (Q5 vs. Q1 OR 1.20, 95% CI = 1.16-1.24) and red blood cell distribution width (Q5 vs. Q1 OR 1.18, 95% CI = 1.14-1.21), among others. High-density lipoprotein cholesterol (Q5 vs. Q1 OR 0.84, 95% CI = 0.81-0.87) and albumin (Q5 vs. Q1 OR 0.84, 95% CI = 0.81-0.87) were inversely associated with cancer. In sex-stratified analyses, higher testosterone was associated with increased risk in females (Q5 vs. Q1 OR 1.23, 95% CI = 1.17-1.30) but not in males. Phosphate was associated with a lower risk in females but a higher risk in males (Q5 vs. Q1 OR 0.94, 95% CI = 0.90-0.99 vs. OR 1.09, 95% CI = 1.04-1.15).
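As a reading aid for the intervals above, the following sketch recovers the approximate standard error of a log odds ratio from a reported 95% CI, assuming the interval is a Wald interval that is symmetric on the log scale. That assumption is not stated in the abstract, the helper names are hypothetical, and the published values are rounded to two decimals, so the results are only approximate.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Approximate SE of log(OR), assuming a symmetric Wald CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def z_statistic(odds_ratio, lower, upper, z=1.96):
    """Approximate Wald z statistic for the null hypothesis OR = 1."""
    return math.log(odds_ratio) / se_from_ci(lower, upper, z)


if __name__ == "__main__":
    # C-reactive protein, Q5 vs. Q1, as reported in the abstract
    print(se_from_ci(1.16, 1.24))
    print(z_statistic(1.20, 1.16, 1.24))
    # Under the log-symmetry assumption, the geometric mean of the limits
    # should sit close to the point estimate
    print(math.sqrt(1.16 * 1.24))
```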

Conclusions: This hypothesis-free analysis suggests personal characteristics, metabolic biomarkers, physical measures and smoking as important predictors of cancer risk, with further studies needed to confirm causality and clinical relevance.

Source
http://dx.doi.org/10.1111/eci.14037

Similar Publications

Background: Virtual reality (VR) and artificial intelligence (AI) technologies have advanced significantly over the past few decades, expanding into various fields, including dental education.

Purpose: To comprehensively review the application of VR and AI technologies in dentistry training, focusing on their impact on cognitive load management and skill enhancement. This study systematically summarizes the existing literature by means of a scoping review to explore the effects of the application of these technologies and to explore future directions.

Background: Hospital-acquired venous thromboembolism (HA-VTE) is a leading cause of morbidity and mortality among hospitalized adults. Numerous prognostic models have been developed to identify those patients with elevated risk of HA-VTE. None, however, has met the necessary criteria to guide clinical decision-making.

Background: Classification of rose species and varieties is a challenging task. Rose is used worldwide for various applications, including but not restricted to skincare, medicine, cosmetics, and fragrance. This study explores the potential of Laser-Induced Breakdown Spectroscopy (LIBS) for species and variety classification of rose flowers, leveraging its advantages such as minimal sample preparation, real-time analysis, and remote sensing.

Multi-scale convolutional attention GRU network combined with improved CARS strategy to determine key elements in ores by XRF.

Anal Chim Acta

November 2025

School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, 611731, PR China; Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, Zhejiang, 313001, PR China; Laboratory for Microwave Spatial Inte

Background: X-ray fluorescence (XRF) technology is a promising method for estimating the metal element content in ores, which helps in understanding ore composition and optimizing mining and processing strategies. However, due to the presence of a large number of redundant features in XRF spectra, traditional quantitative analysis models struggle to effectively capture the nonlinear relationship between element concentration and spectral information of XRF, making it more difficult to accurately predict metal element concentrations. Thus, analyzing ore element concentrations by XRF remains a significant challenge.
