Random forests for the analysis of matched case-control studies.

BMC Bioinformatics

Institute of Medical Biometry, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Bonn, Germany.

Published: August 2024


Article Abstract

Background: Conditional logistic regression trees have been proposed as a flexible alternative to conditional logistic regression, the standard method for analyzing matched case-control studies. While they avoid the strict assumption of linearity and automatically incorporate interactions, conditional logistic regression trees may suffer from relatively high variability. Other machine learning methods for the analysis of matched case-control studies are lacking because conventional machine learning methods cannot handle the matched structure of the data.

Results: A random forest method for the analysis of matched case-control studies based on conditional logistic regression trees is proposed, which overcomes the issue of high variability. It provides an accurate estimation of exposure effects while being more flexible in the functional form of covariate effects. The efficacy of the method is illustrated in a simulation study and in an application to real-world data from a matched case-control study on the effect of regular participation in cervical cancer screening on the development of cervical cancer.
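The abstract does not spell out the algorithm, so the following is only a minimal sketch of two ingredients such a method plausibly relies on: the conditional logistic log-likelihood, which compares each case only with the controls in its own matched set, and a bootstrap that resamples whole matched sets so the matching is preserved in every tree's training sample. The function names, the one-case-per-set restriction, and the choice to resample at the level of matched sets are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def conditional_log_likelihood(beta, X, y, strata):
    """Conditional logistic log-likelihood.

    Assumes exactly one case (y == 1) per matched set (illustrative restriction).
    """
    eta = X @ np.asarray(beta, dtype=float)  # linear predictor per subject
    total = 0.0
    for s in np.unique(strata):
        in_set = strata == s
        eta_s = eta[in_set]
        case_eta = eta_s[y[in_set] == 1][0]
        # Numerically stable log-sum-exp over all members of the matched set.
        m = eta_s.max()
        total += case_eta - (m + np.log(np.exp(eta_s - m).sum()))
    return total


def bootstrap_matched_sets(strata, rng):
    """Draw matched sets with replacement (hypothetical helper).

    Returns the selected row indices and fresh set labels, so that a set
    drawn twice is treated as two separate matched sets.
    """
    set_ids = np.unique(strata)
    drawn = rng.choice(set_ids, size=set_ids.size, replace=True)
    rows, labels = [], []
    for new_label, s in enumerate(drawn):
        members = np.flatnonzero(strata == s)
        rows.append(members)
        labels.append(np.full(members.size, new_label))
    return np.concatenate(rows), np.concatenate(labels)


# Toy data: three 1:1 matched pairs with a single binary exposure.
X = np.array([[1.0], [0.0], [0.0], [1.0], [1.0], [1.0]])
y = np.array([1, 0, 1, 0, 1, 0])
strata = np.array([0, 0, 1, 1, 2, 2])

ll_null = conditional_log_likelihood([0.0], X, y, strata)
rows, labels = bootstrap_matched_sets(strata, np.random.default_rng(0))
```

Because the likelihood is built from within-set comparisons, resampling individual rows would break matched sets apart, which is one way to see why off-the-shelf bagging does not carry over directly to matched data.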

Conclusions: The proposed random forest method is a promising addition to the toolbox for the analysis of matched case-control studies and addresses the need for machine learning methods in this field. It is more flexible than both the standard method of conditional logistic regression and conditional logistic regression trees. It allows for non-linearity and the automatic inclusion of interaction effects and is suitable for both exploratory and explanatory analyses.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11292918
DOI: http://dx.doi.org/10.1186/s12859-024-05877-5
