A Cross-validated Ensemble Approach to Robust Hypothesis Testing of Continuous Nonlinear Interactions: Application to Nutrition-Environment Studies.

Jeremiah Zhe Liu, Wenying Deng, Jane Lee, Pi-I Debby Lin, Linda Valeri, David C Christiani, David C Bellinger, Robert O Wright, Maitreyi M Mazumdar, Brent A Coull

J Am Stat Assoc

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Published: September 2021

Gene-environment and nutrition-environment studies often involve testing of high-dimensional interactions between two sets of variables, each having potentially complex nonlinear main effects on an outcome. Constructing a valid and powerful hypothesis test for such an interaction is challenging because it is difficult to build an efficient and unbiased estimator for the complex, nonlinear main effects. In this work we address this problem by proposing a Cross-validated Ensemble of Kernels (CVEK) that learns the space of appropriate functions for the main effects using a cross-validated ensemble approach. With a carefully chosen library of base kernels, CVEK flexibly estimates the form of the main-effect functions from the data, and encourages test power by guarding against over-fitting under the alternative. The method is motivated by a study on the interaction between metal exposures and maternal nutrition on children's neurodevelopment in rural Bangladesh. The proposed tests identified evidence of an interaction between mineral and vitamin intake and arsenic and manganese exposures. Results suggest that the detrimental effects of these metals are most pronounced at low intake levels of the nutrients, indicating that nutritional interventions in pregnant women could mitigate the adverse impacts of metal exposures on children's neurodevelopment.
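
The idea of a cross-validated ensemble over a library of base kernels can be illustrated with a small sketch. This is not the authors' implementation, and it covers only the ensemble step, not the hypothesis test. Everything below is an assumption made for illustration: the toy one-dimensional data, the three Gaussian (RBF) kernels, the ridge penalty `lam`, the closed-form leave-one-out residuals for kernel ridge regression, and the Frank-Wolfe routine used to pick non-negative weights that sum to one.

```python
import math

def rbf_kernel(xs, length_scale):
    """Gaussian (RBF) kernel matrix for 1-D inputs."""
    return [[math.exp(-(a - b) ** 2 / (2 * length_scale ** 2)) for b in xs] for a in xs]

def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def loo_residuals(K, y, lam):
    """Leave-one-out residuals of kernel ridge regression in closed form:
    H = K (K + lam I)^-1 and e_i = (y_i - (H y)_i) / (1 - H_ii)."""
    n = len(y)
    A_inv = invert([[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)])
    H = [[sum(K[i][k] * A_inv[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [(y[i] - sum(H[i][j] * y[j] for j in range(n))) / (1.0 - H[i][i]) for i in range(n)]

def sse(e):
    """Sum of squared entries."""
    return sum(v * v for v in e)

def ensemble_weights(residuals, n_iter=200):
    """Weights on the simplex that reduce the squared norm of the combined
    leave-one-out residuals (Frank-Wolfe with exact line search)."""
    D, n = len(residuals), len(residuals[0])
    best = min(range(D), key=lambda d: sse(residuals[d]))
    u = [float(d == best) for d in range(D)]  # start at the best single kernel
    for _ in range(n_iter):
        e = [sum(u[d] * residuals[d][i] for d in range(D)) for i in range(n)]
        grad = [sum(ei * ri for ei, ri in zip(e, residuals[d])) for d in range(D)]
        j = min(range(D), key=lambda d: grad[d])  # most promising vertex
        diff = [rj - ei for rj, ei in zip(residuals[j], e)]
        denom = sse(diff)
        if denom == 0.0:
            break
        step = min(1.0, max(0.0, -sum(ei * di for ei, di in zip(e, diff)) / denom))
        if step == 0.0:
            break
        u = [(1.0 - step) * ud + (step if d == j else 0.0) for d, ud in enumerate(u)]
    return u

# Hypothetical toy data: a smooth signal plus a small deterministic perturbation.
n = 25
xs = [4.0 * i / (n - 1) for i in range(n)]
ys = [math.sin(2.0 * x) + 0.1 * math.cos(7.0 * i) for i, x in enumerate(xs)]

# Hypothetical kernel library: three RBF kernels with different length scales.
length_scales = [0.1, 0.5, 2.0]
residuals = [loo_residuals(rbf_kernel(xs, s), ys, lam=0.1) for s in length_scales]
weights = ensemble_weights(residuals)
single_errors = [sse(r) for r in residuals]
ensemble_error = sse([sum(w * r[i] for w, r in zip(weights, residuals)) for i in range(n)])
print(weights, single_errors, ensemble_error)
```

Because the search starts from the best single kernel and each step cannot increase the objective, the weighted ensemble's leave-one-out error is never worse than that of the best kernel in the library, which is one way to read the abstract's point about guarding against over-fitting.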

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9611147
DOI: http://dx.doi.org/10.1080/01621459.2021.1962889

Similar Publications

Performance of Cross-Validated Targeted Maximum Likelihood Estimation.

Stat Med

July 2025

Inequalities in Cancer Outcomes Network, London School of Hygiene and Tropical Medicine, London, UK.

Matthew J Smith, Rachael V Phillips, Camille Maringe, Miguel Angel Luque-Fernandez

Background: Advanced methods for causal inference, such as targeted maximum likelihood estimation (TMLE), require specific convergence rates and the Donsker class condition for valid statistical estimation and inference. In situations where there is no differentiability due to data sparsity or near-positivity violations, the Donsker class condition is violated. In such instances, the bias of the targeted estimand is inflated, and its variance is anti-conservative, leading to poor coverage.

GenAI exceeds clinical experts in predicting acute kidney injury following paediatric cardiopulmonary bypass.

Sci Rep

July 2025

Bristol Royal Hospital for Children, Bristol, UK.

Mansour Sharabiani, Alireza Mahani, Alex Bottle, Yadav Srinivasan, Richard Issitt

The emergence of large language models (LLMs) opens new horizons for leveraging often-unused information in clinical text. Our study aims to capitalise on this new potential. Specifically, we examine the utility of text embeddings generated by LLMs in predicting postoperative acute kidney injury (AKI) in paediatric cardiopulmonary bypass (CPB) patients using electronic health record (EHR) text, and propose methods for explaining their output.

Predicting Rheological Properties of Asphalt Modified with Mineral Powder: Bagging, Boosting, and Stacking vs. Single Machine Learning Models.

Materials (Basel)

June 2025

School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan 430000, China.

Haibing Huang, Zujie Xu, Xiaoliang Li, Bin Liu, Xiangyang Fan

This study systematically compares the predictive performance of single machine learning (ML) models (KNN, Bayesian ridge regression, decision tree) and ensemble learning methods (bagging, boosting, stacking) for quantifying the rheological properties of mineral powder-modified asphalt, specifically the complex shear modulus (G*) and the phase angle (δ). Modified emulsified asphalt was fabricated with two emulsifiers and three mineral powders and then subjected to rheological property tests. Dynamic shear rheometer (DSR) test data were preprocessed using the local outlier factor (LOF) algorithm, followed by K-fold cross-validation (K = 5) and Bayesian optimization to tune model hyperparameters.

Dynamic weighted ensemble model for predictive optimization in green sand casting: Advancing industry 4.0 manufacturing.

MethodsX

June 2025

Department of Artificial Intelligence and Machine Learning, Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, Maharashtra, India.

Rajesh V Rajkolhe, Dr Sanjay S Bhagwat, Dr Priyanka V Deshmukh

This research presents an enhanced predictive model for green sand casting, designed to tackle the nonlinear complexities arising from interdependent process parameters. Casting defects substantially affect product quality and rejection rates, making accurate prediction vital. To overcome the limitations of individual machine learning models and static ensemble strategies, a novel Dynamic Weighted Ensemble (DWE) model is introduced.

Deep Learning Model for Histologic Diagnosis of Dysplastic Barrett's Esophagus: Multisite Cohort External Validation.

Am J Gastroenterol

April 2025

Barrett's Esophagus Unit, Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA.

D Chamil Codipilly, Shahriar Faghani, David Vogelsang, Mana Moassefi, Nikita Garg

Introduction: The risk of progression to esophageal adenocarcinoma (EAC) in Barrett's esophagus (BE) increases with advancing degrees of dysplasia. There is a critical need to improve the diagnosis of BE dysplasia, given substantial interobserver variability and overcalls of dysplasia during manual community pathologist reads. We aimed to externally validate a previously cross-validated BE dysplasia diagnosis deep learning model (BEDDLM) that predicts dysplasia grade on whole slide images (WSIs).
